Sciweavers

» Universal Reinforcement Learning
ECML
2007
Springer
15 years 11 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ICCS
1993
Springer
15 years 9 months ago
Towards Domain-Independent Machine Intelligence
Adaptive predictive search (APS) is a learning system framework which, given little initial domain knowledge, increases its decision-making abilities in complex problem domains....
Robert Levinson
NIPS
2008
15 years 6 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
ICML
2004
IEEE
16 years 5 months ago
Communication complexity as a lower bound for learning in games
A fast-growing body of research in the AI and machine learning communities addresses learning in games, where there are multiple learners with different interests. This research a...
Vincent Conitzer, Tuomas Sandholm
ETS
2000
IEEE
15 years 4 months ago
Dynamic Goal-Based Role-Play Simulation on the Web: A Case Study
This paper outlines and discusses the pedagogical approach, the technical design architecture, and an innovative implementation of a collaborative role-play simulation technology ...
Som Naidu, Albert Ip, Roni Linser