Search Sciweavers | Sciweavers

1799 search results - page 209 / 360

» Filtered Reinforcement Learning

ANOR
2005

80 views

Entropic Penalties in Finite Games

15 years 4 months ago

Download www.science.unitn.it

The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...

Sjur Didrik Flåm, E. Cavazzuti

ALIFE
2002

176 views, Modeling And Simulation

Ant Colony Optimization and Stochastic Gradient Descent

15 years 4 months ago

Download ti.arc.nasa.gov

In this paper, we study the relationship between the two techniques known as ant colony optimization (aco) and stochastic gradient descent. More precisely, we show that some empir...

Nicolas Meuleau, Marco Dorigo

AGI
2011

222 views, Artificial Intelligence

Measuring Agent Intelligence via Hierarchies of Environments

14 years 8 months ago

Download www.ssec.wisc.edu

Under Legg’s and Hutter’s formal measure [1], performance in easy environments counts more toward an agent’s intelligence than does performance in difficult environments. An ...

Bill Hibbard

CORR
2006
Springer

140 views, Education

Nearly optimal exploration-exploitation decision thresholds

15 years 4 months ago

Download www.idiap.ch

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...

Christos Dimitrakakis

posted by olethros

ML
2010
ACM

135 views, Machine Learning

Multi-domain learning by confidence-weighted parameter combination

14 years 11 months ago

Download www.cs.jhu.edu

State-of-the-art statistical NLP systems for a variety of tasks learn from labeled training data that is often domain specific. However, there may be multiple domains or sources o...

Mark Dredze, Alex Kulesza, Koby Crammer
