Search Sciweavers | Sciweavers

81 search results - page 13 / 17

» The Optimal Reward Baseline for Gradient-Based Reinforcement...

click to vote

IIE
2007

63views more IIE 2007»

Investigation of Q-Learning in the Context of a Virtual Learning Environment

14 years 11 months ago

Download www.mii.lt

We investigate the possibility to apply a known machine learning algorithm of Q-learning in the domain of a Virtual Learning Environment (VLE). It is important in this problem doma...

Dalia Baziukaite

claim paper

Read More »

111

click to vote

AGENTS
1999
Springer

126views Security Privacy» more AGENTS 1999»

General Principles of Learning-Based Multi-Agent Systems

15 years 4 months ago

Download web.engr.oregonstate.edu

We consider the problem of how to design large decentralized multiagent systems (MAS’s) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a...

David Wolpert, Kevin R. Wheeler, Kagan Tumer

claim paper

Read More »

112

click to vote

ATAL
2010
Springer

146views Intelligent Agents» more ATAL 2010»

PAC-MDP learning with knowledge-based admissible models

14 years 12 months ago

Download www.aamas-conference.org

PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way which guarantees that with high probability, the algorithm pe...

Marek Grzes, Daniel Kudenko

claim paper

Read More »

Voted

ICML
2010
IEEE

258views Machine Learning» more ICML 2010»

Feature Selection as a One-Player Game

15 years 22 days ago

Download www.lri.fr

This paper formalizes Feature Selection as a Reinforcement Learning problem, leading to a provably optimal though intractable selection policy. As a second contribution, this pape...

Romaric Gaudel, Michèle Sebag

claim paper

Read More »

106

click to vote

COLT
2010
Springer

207views Machine Learning» more COLT 2010»

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

14 years 9 months ago

Download www.colt2010.org

Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

claim paper

Read More »

« Prev « First page 13 / 17 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers