Search Sciweavers | Sciweavers

2040 search results - page 312 / 408

» Approximate Expectation Maximization

NIPS
2001

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

AAAI
1998

Bayesian Q-Learning

A central problem in learning in complex environments is balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of explora...

Richard Dearden, Nir Friedman, Stuart J. Russell

NIPS
2000

High-temperature Expansions for Learning Models of Nonnegative Data

Recent work has exploited boundedness of data in the unsupervised learning of new types of generative model. For nonnegative data it was recently shown that the maximum-entropy ge...

Oliver B. Downs

WSC
2000

Simulation optimization of stochastic systems with integer variables by sequential linearization

Discrete-event simulation is widely used to analyse and improve the performance of manufacturing systems. The related optimization problem often includes integer design variables ...

S. J. Abspoel, L. F. P. Etman, J. Vervoort, J. E. ...

IJCAI
1989

A Model for Projection and Action

In designing autonomous agents that deal competently with issues involving time and space, there is a tradeoff to be made between guaranteed response-time reactions on the one han...

Keiji Kanazawa, Thomas Dean
