Search Sciweavers | Sciweavers

829 search results - page 15 / 166

» A time aggregation approach to Markov decision processes

IJCAI
1997

115 views, Artificial Intelligence

Aggregating Features and Matching Cases on Vague Linguistic Expressions

15 years 3 months ago

Download dli.iiit.ac.in

Decision making based on the comparison of multiple criteria of two or more alternatives is the subject of intensive research. In many decision making situations, a single criter...

Alfons Schuster, Werner Dubitzky, Philippe Lopes, ...

COLT
2000
Springer

87 views, Machine Learning

Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning

15 years 6 months ago

Download www.cs.iastate.edu

We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process (POMDP), and focus on gradient ascent approache...

Peter L. Bartlett, Jonathan Baxter

AAAI
2011

145 views, Intelligent Agents

Policy Gradient Planning for Environmental Decision Making with Existing Simulators

14 years 1 month ago

Download www.cs.ubc.ca

In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...

Mark Crowley, David Poole

ICALP
2009
Springer

92 views, Programming Languages

Reachability in Stochastic Timed Games

16 years 2 months ago

Download www.lsv.ens-cachan.fr

We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al.), and which extend in a natural way continuous-...

Patricia Bouyer, Vojtech Forejt

ML
2002
ACM

121 views, Machine Learning

Near-Optimal Reinforcement Learning in Polynomial Time

15 years 1 months ago

Download www.cis.upenn.edu

We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...

Michael J. Kearns, Satinder P. Singh
