Search Sciweavers | Sciweavers

132 search results - page 22 / 27

» Generalization in Reinforcement Learning: Safely Approximati...

ICML
2008
IEEE

Learning all optimal policies with multiple criteria

We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...

Leon Barrett, Srini Narayanan

ICML
2005
IEEE

A model for handling approximate, noisy or incomplete labeling in text classification

We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of...

Ganesh Ramakrishnan, Krishna Prasad Chitrapura, Ra...

GECCO
2009
Springer

Apply ant colony optimization to Tetris

Tetris is a falling block game where the player’s objective is to arrange a sequence of different shaped tetrominoes smoothly in order to survive. In the intelligence games, ag...

Xingguo Chen, Hao Wang, Weiwei Wang, Yinghuan Shi,...

ICML
2009
IEEE

Regularization and feature selection in least-squares temporal difference learning

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng

ICML
2008
IEEE

Sample-based learning and search with permanent and transient memories

We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...

David Silver, Martin Müller, Richard S. ...
