Sciweavers

115 search results for "Recurrent policy gradients" - page 9 / 23
AAAI
2010
15 years 1 month ago
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...
Chongjie Zhang, Victor R. Lesser
FLAIRS
2004
15 years 1 month ago
Recurrent Neural Networks and Pitch Representations for Music Tasks
We present results from experiments in using several pitch representations for jazz-oriented musical tasks performed by a recurrent neural network. We have run experiments with se...
Judy A. Franklin
NN
1998
Springer
14 years 11 months ago
How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies
Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NA...
Tsungnan Lin, Bill G. Horne, C. Lee Giles
ICML
2009
IEEE
16 years 16 days ago
Monte-Carlo simulation balancing
In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
David Silver, Gerald Tesauro
ORL
2008
14 years 11 months ago
On polynomial cases of the unichain classification problem for Markov Decision Processes
The unichain classification problem detects whether a finite state and action MDP is unichain under all deterministic policies. This problem is NP-hard [11]. This paper provides p...
Eugene A. Feinberg, Fenghsu Yang