Search Sciweavers | Sciweavers

513 search results - page 94 / 103

» Metric learning for reinforcement learning agents

JAIR
2011

187 views

A Monte-Carlo AIXI Approximation

14 years 7 months ago

Download www.hutter1.net

This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two ke...

Joel Veness, Kee Siong Ng, Marcus Hutter, William ...

ATAL
2006
Springer

127 views, Intelligent Agents

Learning to commit in repeated games

15 years 4 months ago

Download staff.science.uva.nl

Learning to converge to an efficient, i.e., Pareto-optimal Nash equilibrium of the repeated game is an open problem in multiagent learning. Our goal is to facilitate the learning ...

Stéphane Airiau, Sandip Sen

ATAL
2008
Springer

180 views, Intelligent Agents

On the usefulness of opponent modeling: the Kuhn Poker case study

15 years 2 months ago

Download www.ifaamas.org

The application of reinforcement learning algorithms to Partially Observable Stochastic Games (POSG) is challenging since each agent does not have access to the whole state inform...

Alessandro Lazaric, Mario Quaresimale, Marcello Re...

FLAIRS
2004

143 views, Artificial Intelligence

A New Filtering Model towards an Intelligent Guide Agent

15 years 1 month ago

Download www.aaai.org

In E-learning systems, where both helpers (tutors) and learners are separated geographically, finding a reliable helper is one of the most important challenges. Although helpers c...

Mohammed Abdel Razek, Claude Frasson, Marc Kaltenb...

ATAL
2003
Springer

123 views, Intelligent Agents

Team formation and communication restrictions in collectives

15 years 5 months ago

Download ti.arc.nasa.gov

A collective of agents often needs to maximize a “world utility” function which rates the performance of an entire system, while subject to communication restrictions among th...

Adrian K. Agogino, Kagan Tumer
