Sciweavers search results: Reinforcement Learning Estimation of Distribution Algorithm

ICML
2006
IEEE

Machine Learning

Estimating relatedness via data compression

Download people.csail.mit.edu

We show that it is possible to use data compression on independently obtained hypotheses from various tasks to algorithmically provide guarantees that the tasks are sufficiently r...

Brendan Juba

ICML
2001
IEEE

Machine Learning

Expectation Maximization for Weakly Labeled Data

Download characters.media.mit.edu

We call data weakly labeled if it has no exact label but rather a numerical indication of correctness of the label "guessed" by the learning algorithm - a situation comm...

Yuri A. Ivanov, Bruce Blumberg, Alex Pentland

ATAL
2004
Springer

Intelligent Agents

Adaptive, Distributed Control of Constrained Multi-Agent Systems

Download collectives.stanford.edu

Product Distribution (PD) theory was recently developed as a framework for analyzing and optimizing distributed systems. In this paper we demonstrate its use for adaptive distribu...

Stefan Bieniawski, David Wolpert

CORR
2010
Springer

Education

Predictive State Temporal Difference Learning

Download www.cs.cmu.edu

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...

Byron Boots, Geoffrey J. Gordon

ECML
2005
Springer

Machine Learning

Natural Actor-Critic

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal
