AAAI
2008
Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation
Off-policy reinforcement learning aims to efficiently reuse data samples gathered in the past, an essential capability for physically grounded AI as experiments are us...
Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiya...