Search Sciweavers | Sciweavers

64 search results - page 5 / 13

» Multi-Agent Learning with Policy Prediction


MLMTA
2003

153 views, Machine Learning

Using a Two-Layered Case-Based Reasoning for Prediction in Soccer Coach

15 years 6 months ago

Download ce.sharif.edu

Abstract— The prediction of the future states in MultiAgent Systems has been a challenging problem since the beginning of MAS. Robotic soccer is a MAS environment in which the pre...

Mazda Ahmadi, Abolfazl Keighobadi Lamjiri, Mayssam...



ICML
2009
IEEE

148 views, Machine Learning

Predictive representations for policy gradient in POMDPs

16 years 5 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa



AAAI
2008

169 views, Intelligent Agents

Perpetual Learning for Non-Cooperative Multiple Agents

15 years 7 months ago

Download www.aaai.org

This paper examines, by argument, the dynamics of sequences of behavioural choices made, when non-cooperative restricted-memory agents learn in partially observable stochastic gam...

Luke Dickens



ICMLA
2008

130 views, Machine Learning

A Predictive Model for Imitation Learning in Partially Observable Environments

15 years 6 months ago

Download www.damas.ift.ulaval.ca

Learning by imitation has been shown to be a powerful paradigm for automated learning in autonomous robots. This paper presents a general framework of learning by imitation for stochas...

Abdeslam Boularias



JMLR
2012

200 views, Programming Languages

Contextual Bandit Learning with Predictable Rewards

13 years 7 months ago

Download www.cs.princeton.edu

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...

Alekh Agarwal, Miroslav Dudík, Satyen Kale,...


