Search Sciweavers | Sciweavers


» Sparse reward processes

SAT
2004
Springer

89views Hardware» more SAT 2004»

Using Rewarding Mechanisms for Improving Branching Heuristics

16 years 1 months ago

Download www.satisfiability.org

The variable branching heuristics used in the most recent and most effective SAT solvers, including zChaff and BerkMin, can be viewed as consisting of a simple mechanism for rewa...

Elsa Carvalho, João P. Marques Silva

IJCAI
2007

254views Artificial Intelligence» more IJCAI 2007»

Bayesian Inverse Reinforcement Learning

15 years 9 months ago

Download www.ijcai.org

Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...

Deepak Ramachandran, Eyal Amir

NIPS
2004

103views Information Technology» more NIPS 2004»

Experts in a Markov Decision Process

15 years 9 months ago

Download books.nips.cc

We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...

Eyal Even-Dar, Sham M. Kakade, Yishay Mansour

JAIR
2006

157views more JAIR 2006»

Decision-Theoretic Planning with non-Markovian Rewards

15 years 7 months ago

Download www.jair.org

A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic...

Sylvie Thiébaux, Charles Gretton, John K. S...

ICML
2006
IEEE

142views Machine Learning» more ICML 2006»

An intrinsic reward mechanism for efficient exploration

16 years 8 months ago

Download www-anw.cs.umass.edu

How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...

Özgür Simsek, Andrew G. Barto
