Sciweavers

1227 search results - page 52 / 246
» Learning Rates for Q-Learning
ICML
2003
IEEE
15 years 10 months ago
TD(0) Converges Provably Faster than the Residual Gradient Algorithm
In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...
Ralf Schoknecht, Artur Merke
ACL
2011
14 years 1 months ago
Semi-supervised latent variable models for sentence-level sentiment analysis
We derive two variants of a semi-supervised model for fine-grained sentiment analysis. Both models leverage abundant natural supervision in the form of review ratings, as well as...
Oscar Täckström, Ryan T. McDonald
SIGIR
2011
ACM
14 years 17 days ago
Fast context-aware recommendations with factorization machines
The situation in which a choice is made is important information for recommender systems. Context-aware recommenders take this information into account to make predictions. So ...
Steffen Rendle, Zeno Gantner, Christoph Freudentha...
ICML
2006
IEEE
15 years 10 months ago
Relational temporal difference learning
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Nima Asgharbeygi, David J. Stracuzzi, Pat Langley
IJCNN
2006
IEEE
15 years 3 months ago
Learning a Rendezvous Task with Dynamic Joint Action Perception
Groups of reinforcement learning agents interacting in a common environment often fail to learn optimal behaviors. Poor performance is particularly common in environmen...
Nancy Fulda, Dan Ventura