Sciweavers

451 search results - page 3 / 91
» Temporal Rewards for Performance Evaluation
JAIR
2006
13 years 5 months ago
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic...
Sylvie Thiébaux, Charles Gretton, John K. S...
VALUETOOLS
2006
ACM
13 years 11 months ago
Analysis of Markov reward models using zero-suppressed multi-terminal BDDs
High-level stochastic description methods such as stochastic Petri nets, stochastic UML statecharts etc., together with specifications of performance variables (PVs), enable a co...
Kai Lampka, Markus Siegle
AAAI
2006
13 years 7 months ago
QUICR-Learning for Multi-Agent Coordination
Coordinating multiple agents that need to perform a sequence of actions to maximize a system level reward requires solving two distinct credit assignment problems. First, credit m...
Adrian K. Agogino, Kagan Tumer
CHI
2010
ACM
14 years 8 days ago
Physical activity motivating games: virtual rewards for real activity
Contemporary lifestyle has become increasingly sedentary: little physical (sports, exercises) and much sedentary (TV, computers) activity. The nature of sedentary activity is self...
Shlomo Berkovsky, Mac Coombe, Jill Freyne, Dipak B...
AAAI
2006
13 years 7 months ago
Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to a...
Andrea Lockerd Thomaz, Cynthia Breazeal