Sciweavers

115 search results - page 9 / 23
» Recurrent policy gradients
AAAI
2010
14 years 11 months ago
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...
Chongjie Zhang, Victor R. Lesser
FLAIRS
2004
14 years 11 months ago
Recurrent Neural Networks and Pitch Representations for Music Tasks
We present results from experiments in using several pitch representations for jazz-oriented musical tasks performed by a recurrent neural network. We have run experiments with se...
Judy A. Franklin
NN
1998
Springer
Neural Networks
14 years 9 months ago
How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies
Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NA...
Tsungnan Lin, Bill G. Horne, C. Lee Giles
ICML
2009
IEEE
15 years 10 months ago
Monte-Carlo simulation balancing
In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
David Silver, Gerald Tesauro
ORL
2008
14 years 9 months ago
On polynomial cases of the unichain classification problem for Markov Decision Processes
The unichain classification problem detects whether a finite state and action MDP is unichain under all deterministic policies. This problem is NP-hard [11]. This paper provides p...
Eugene A. Feinberg, Fenghsu Yang