Sciweavers

4013 search results - page 482 / 803
» computer 2002
ML
2002
ACM
100 views, Machine Learning
15 years 5 months ago
Structure in the Space of Value Functions
Solving in an efficient manner many different optimal control tasks within the same underlying environment requires decomposing the environment into its computationally elemental ...
David J. Foster, Peter Dayan
ML
2002
ACM
121 views, Machine Learning
15 years 5 months ago
Near-Optimal Reinforcement Learning in Polynomial Time
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...
Michael J. Kearns, Satinder P. Singh
ML
2002
ACM
168 views, Machine Learning
15 years 5 months ago
On Average Versus Discounted Reward Temporal-Difference Learning
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
John N. Tsitsiklis, Benjamin Van Roy
MP
2002
84 views
15 years 5 months ago
A decomposition procedure based on approximate Newton directions
The efficient solution of large-scale linear and nonlinear optimization problems may require exploiting any special structure in them in an efficient manner. We describe and analy...
Antonio J. Conejo, Francisco J. Nogales, Francisco...
MP
2002
85 views
15 years 5 months ago
Generalized Goal Programming: polynomial methods and applications
In this paper we address a general Goal Programming problem with linear objectives, convex constraints, and an arbitrary componentwise nondecreasing norm to aggregate deviations w...
Emilio Carrizosa, Jörg Fliege