Search Sciweavers | Sciweavers

16 search results - page 3 / 4

» Bounded Parameter Markov Decision Processes with Average Rew...

click to vote

NIPS
2001

144views Information Technology» more NIPS 2001»

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

13 years 7 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

claim paper

Read More »

click to vote

CORR
2008
Springer

189views Education» more CORR 2008»

Algorithms for Dynamic Spectrum Access with Learning for Cognitive Radio

13 years 5 months ago

Download www.ifp.illinois.edu

We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperati...

Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli

claim paper

Read More »

click to vote

NECO
2007

150views more NECO 2007»

Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule

13 years 5 months ago

Download eprints.pascal-network.org

Learning agents, whether natural or artiﬁcial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...

Dorit Baras, Ron Meir

claim paper

Read More »

click to vote

CDC
2008
IEEE

197views Control Systems» more CDC 2008»

Dynamic spectrum access policies for cognitive radio

14 years 7 days ago

Download www.ifp.illinois.edu

—We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooper...

Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli

claim paper

Read More »

click to vote

IPCO
2010

125views Optimization» more IPCO 2010»

A Pumping Algorithm for Ergodic Stochastic Mean Payoff Games with Perfect Information

13 years 7 months ago

Download www.mpi-inf.mpg.de

Abstract. We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph G = (V = VB VW VR, E), with local rewards r : E R...

Endre Boros, Khaled M. Elbassioni, Vladimir Gurvic...

claim paper

Read More »

« Prev « First page 3 / 4 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers