Sciweavers

» Sparse reward processes
SAT
2004
Springer
15 years 2 months ago
Using Rewarding Mechanisms for Improving Branching Heuristics
The variable branching heuristics used in the most recent and most effective SAT solvers, including zChaff and BerkMin, can be viewed as consisting of a simple mechanism for rewa...
Elsa Carvalho, João P. Marques Silva
IJCAI
2007
14 years 11 months ago
Bayesian Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...
Deepak Ramachandran, Eyal Amir
NIPS
2004
14 years 11 months ago
Experts in a Markov Decision Process
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
Eyal Even-Dar, Sham M. Kakade, Yishay Mansour
JAIR
2006
14 years 9 months ago
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic...
Sylvie Thiébaux, Charles Gretton, John K. S...
ICML
2006
IEEE
15 years 10 months ago
An intrinsic reward mechanism for efficient exploration
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Özgür Simsek, Andrew G. Barto