Search Sciweavers | Sciweavers

168 search results - page 24 / 34

» Optimism in Reinforcement Learning Based on Kullback-Leibler...

JAIR
2011

144 views

Non-Deterministic Policies in Markovian Decision Processes

14 years 4 months ago

Download www.jair.org

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making proble...

Mahdi Milani Fard, Joelle Pineau

ESANN
2007

122 views, Neural Networks

The Recurrent Control Neural Network

14 years 11 months ago

Download www.dice.ucl.ac.be

This paper presents our Recurrent Control Neural Network (RCNN), which is a model-based approach for a data-efficient modelling and control of reinforcement learning problems in di...

Anton Maximilian Schäfer, Steffen Udluft, Han...

SIGDIAL
2010

137 views, Natural Language Processing

Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy

14 years 7 months ago

Download mastarpj.nict.go.jp

This paper presents a spoken dialogue framework that helps users in making decisions. Users often do not have a definite goal or criteria for selecting from a list of alternatives...

Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chi...

KDD
2004
ACM

158 views, Data Mining

A generalized maximum entropy approach to Bregman co-clustering and matrix approximation

15 years 10 months ago

Download www.ideal.ece.utexas.edu

Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic ...

Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...

ICML
2010
IEEE

222 views, Machine Learning

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

14 years 7 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner
