Search Sciweavers | Sciweavers

575 search results - page 39 / 115

» Reinforcement Learning State Estimator

click to vote

ICML
1999
IEEE

129views Machine Learning» more ICML 1999»

Implicit Imitation in Multiagent Reinforcement Learning

15 years 10 months ago

Download www.cs.toronto.edu

Imitation is actively being studied as an effective means of learning in multi-agent environments. It allows an agent to learn how to act well (perhaps optimally) by passively obs...

Bob Price, Craig Boutilier

claim paper

Read More »

101

click to vote

EWRL
2008

186views Machine Learning» more EWRL 2008»

Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case

14 years 11 months ago

Download webee.technion.ac.il

We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...

Kirill Dyagilev, Shie Mannor, Nahum Shimkin

claim paper

Read More »

click to vote

ECAI
2010
Springer

238views Artificial Intelligence» more ECAI 2010»

The Dynamics of Multi-Agent Reinforcement Learning

14 years 11 months ago

Download www.doc.ic.ac.uk

Abstract. Infinite-horizon multi-agent control processes with nondeterminism and partial state knowledge have particularly interesting properties with respect to adaptive control, ...

Luke Dickens, Krysia Broda, Alessandra Russo

claim paper

Read More »

click to vote

ML
2002
ACM

121views Machine Learning» more ML 2002»

Near-Optimal Reinforcement Learning in Polynomial Time

14 years 9 months ago

Download www.cis.upenn.edu

We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...

Michael J. Kearns, Satinder P. Singh

claim paper

Read More »

click to vote

ECML
2004
Springer

100views Machine Learning» more ECML 2004»

Dynamic Asset Allocation Exploiting Predictors in Reinforcement Learning Framework

15 years 3 months ago

Download bi.snu.ac.kr

Given the pattern-based multi-predictors of the stock price, we study a method of dynamic asset allocation to maximize the trading performance. To optimize the proportion of asset ...

Jangmin O, Jae Won Lee, Jongwoo Lee, Byoung-Tak Zh...

claim paper

Read More »

« Prev « First page 39 / 115 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers