Sciweavers

270 search results - page 37 / 54
» Estimation of non-stationary Markov Chain transition models
CORR
2010
Springer
14 years 10 months ago
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
CVPR
2001
IEEE
16 years 1 month ago
Texture Replacement in Real Images
Texture replacement in real images has many applications, such as interior design, digital movie making and computer graphics. The goal is to replace some specified texture patter...
Yanghai Tsin, Yanxi Liu, Visvanathan Ramesh
AIPS
2007
15 years 2 months ago
Learning to Plan Using Harmonic Analysis of Diffusion Models
This paper summarizes research on a new emerging framework for learning to plan using the Markov decision process model (MDP). In this paradigm, two approaches to learning to plan...
Sridhar Mahadevan, Sarah Osentoski, Jeffrey Johns,...
ICML
1999
IEEE
16 years 18 days ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
SPAA
1990
ACM
15 years 3 months ago
Analysis of Multithreaded Architectures for Parallel Computing
Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper ...
Rafael H. Saavedra-Barrera, David E. Culler, Thors...