The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Temporal difference methods and Bellman residual m...
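A minimal numerical sketch of the two criteria named above (the temporal-difference / projected fixed point versus Bellman residual minimisation), on an assumed 3-state Markov reward process with linear features; the transition matrix, rewards, and basis below are illustrative, not taken from the paper:

```python
import numpy as np

# Illustrative 3-state Markov reward process (assumed for this sketch).
P = np.array([[0.7, 0.3, 0.0],
              [0.0, 0.6, 0.4],
              [0.2, 0.0, 0.8]])          # transition probabilities
r = np.array([0.0, 1.0, 2.0])            # expected one-step rewards
gamma = 0.9
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])             # linear basis functions
d = np.array([1/3, 1/3, 1/3])            # state weighting
D = np.diag(d)

# TD / LSTD criterion: solve the projected Bellman fixed point
#   Phi^T D (Phi - gamma P Phi) theta = Phi^T D r
A_td = Phi.T @ D @ (Phi - gamma * P @ Phi)
b = Phi.T @ D @ r
theta_td = np.linalg.solve(A_td, b)

# Bellman residual minimisation:
#   minimise || Phi theta - (r + gamma P Phi theta) ||_D^2
M = Phi - gamma * P @ Phi
theta_br = np.linalg.lstsq(np.sqrt(d)[:, None] * M, np.sqrt(d) * r, rcond=None)[0]

print("TD fixed-point weights:  ", theta_td)
print("Bellman-residual weights:", theta_br)
print("True values:             ", np.linalg.solve(np.eye(3) - gamma * P, r))
```

The two weight vectors generally differ, which is exactly why the choice of criterion matters for approximate policy evaluation.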
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
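One standard device for reusing samples gathered under a different (behaviour) policy is importance weighting; the tiny sketch below illustrates only that general idea, and the policies, trajectory, and numbers are illustrative assumptions rather than the specific method of this work:

```python
import numpy as np

# Hypothetical trajectory collected under a behaviour policy b, to be reused
# for evaluating a different target policy pi (all numbers are illustrative).
states  = [0, 1, 2]
actions = [1, 0, 1]
rewards = [0.5, 1.0, 2.0]
gamma = 0.95

# Action probabilities pi(a|s) and b(a|s) over the two actions {0, 1}.
pi = {0: [0.2, 0.8], 1: [0.5, 0.5], 2: [0.1, 0.9]}
b  = {0: [0.5, 0.5], 1: [0.5, 0.5], 2: [0.5, 0.5]}

# Trajectory-wise importance weight: product of pi(a_t|s_t) / b(a_t|s_t).
w = np.prod([pi[s][a] / b[s][a] for s, a in zip(states, actions)])

# Importance-weighted estimate of the discounted return under pi.
G = sum(gamma**t * rew for t, rew in enumerate(rewards))
print("importance weight:", w)
print("weighted return estimate:", w * G)
```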
In reinforcement learning, it is a common practice to map the state(-action) space to a different one using basis functions. This transformation aims to represent the input data i...
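As a concrete (assumed) example of such a basis-function mapping, the sketch below transforms a one-dimensional state into Gaussian radial-basis features and represents the value function linearly in that feature space; the centres, width, and weights are illustrative:

```python
import numpy as np

def rbf_features(s, centres, width=0.5):
    """Map a scalar state s to Gaussian radial-basis features."""
    return np.exp(-((s - centres) ** 2) / (2.0 * width ** 2))

# Illustrative basis: five Gaussian bumps spread over the state interval [0, 1].
centres = np.linspace(0.0, 1.0, 5)

# With linear function approximation, the value estimate is a dot product
# between the feature vector and a learned weight vector.
w = np.array([0.1, 0.4, 0.9, 0.4, 0.1])   # assumed weights, for illustration only

def value(s):
    return rbf_features(s, centres) @ w

print("features of s=0.3:", rbf_features(0.3, centres))
print("V(0.3) ~", value(0.3))
```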
We consider the problem of characterisation of sequences of heterogeneous symbolic data that arise from a common underlying temporal pattern. The data, which are subject to impreci...
Here we advocate an approach to learning hardware based on induction of finite state machines from temporal logic constraints. The method involves training on examples, constraint...
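A minimal sketch of the induction step on which such an approach rests: a brute-force search for a small Mealy machine consistent with a few input/output training examples. The traces, alphabet, and state bound are illustrative assumptions, and the temporal-logic constraints of the full method are not modelled here:

```python
from itertools import product

# Training examples: (input sequence, expected output sequence), binary alphabet.
examples = [
    ([0, 1, 1, 0], [0, 1, 0, 0]),
    ([1, 1, 0],    [1, 0, 0]),
]
N_STATES = 2  # bound on machine size for the brute-force search

def run(machine, inputs):
    """Run a Mealy machine {(state, symbol): (next_state, output)} from state 0."""
    state, outputs = 0, []
    for x in inputs:
        state, y = machine[(state, x)]
        outputs.append(y)
    return outputs

def induce():
    keys = [(q, x) for q in range(N_STATES) for x in (0, 1)]
    # Enumerate every assignment of (next_state, output) to each (state, input) pair.
    for choice in product(product(range(N_STATES), (0, 1)), repeat=len(keys)):
        machine = dict(zip(keys, choice))
        if all(run(machine, xs) == ys for xs, ys in examples):
            return machine
    return None

print("consistent machine:", induce())
```

In practice such enumeration is replaced by constraint-driven search, but the consistency check against example traces is the same.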
Marek A. Perkowski, Alan Mishchenko, Anatoli N. Ch...