Search Sciweavers | Sciweavers

252 search results - page 14 / 51

» Learning Partially Observable Action Models: Efficient Algor...

EWRL
2008

Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case

We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...

Kirill Dyagilev, Shie Mannor, Nahum Shimkin

TSMC
2008

Ensemble Algorithms in Reinforcement Learning

This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and fin...

Marco A. Wiering, Hado van Hasselt

PERCOM
2007
ACM

Sensor Scheduling for Optimal Observability Using Estimation Entropy

We consider sensor scheduling as the optimal observability problem for partially observable Markov decision processes (POMDPs). This model fits the cases where a Markov process ...

Mohammad Rezaeian

CORR
2011
Springer

Doubly Robust Policy Evaluation and Learning

We study decision making in environments where the reward is only partially observed, but can be modeled as a function of an action and an observed context. This setting, known as...

Miroslav Dudík, John Langford, Lihong Li

ICASSP
2008
IEEE

Bayesian update of dialogue state for robust dialogue systems

This paper presents a new framework for accumulating beliefs in spoken dialogue systems. The technique is based on updating a Bayesian Network that represents the underlying state...

Blaise Thomson, Jost Schatzmann, Steve Young
