Sciweavers

679 search results - page 92 / 136
» Approximate Temporal Aggregation
IAT
2005
IEEE
15 years 6 months ago
Decomposing Large-Scale POMDP Via Belief State Analysis
Partially observable Markov decision process (POMDP) is commonly used to model a stochastic environment with unobservable states for supporting optimal decision making. Computing ...
Xin Li, William K. Cheung, Jiming Liu
ATAL
2004
Springer
15 years 6 months ago
Decentralized Markov Decision Processes with Event-Driven Interactions
Decentralized MDPs provide a powerful formal framework for planning in multi-agent systems, but the complexity of the model limits its usefulness. We study in this paper a class o...
Raphen Becker, Shlomo Zilberstein, Victor R. Lesse...
ECML
2004
Springer
15 years 6 months ago
Filtered Reinforcement Learning
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
Douglas Aberdeen
ICPP
1997
IEEE
15 years 5 months ago
Communication in Parallel Applications: Characterization and Sensitivity Analysis
Communication characterization of parallel applications is essential to understand the interplay between architectures and applications in determining the maximum achievable perfo...
Dale Seed, Anand Sivasubramaniam, Chita R. Das
ECAI
2006
Springer
15 years 4 months ago
Least Squares SVM for Least Squares TD Learning
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani