Sciweavers

17 search results - page 2 / 4
» Analysis of a Classification-based Policy Iteration Algorith...
INFOCOM
2007
IEEE
13 years 11 months ago
Near-Optimal Data Dissemination Policies for Multi-Channel, Single Radio Wireless Sensor Networks
We analyze the performance limits of data dissemination with multi-channel, single radio sensors. We formulate the problem of minimizing the average delay of data dissem...
David Starobinski, Weiyao Xiao, Xiangping Qin, Ari...
ICONIP
2009
13 years 3 months ago
Tracking in Reinforcement Learning
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
ICML
2010
IEEE
13 years 6 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
CORR
2010
Springer
13 years 5 months ago
Global Optimization for Value Function Approximation
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...
Marek Petrik, Shlomo Zilberstein
JCDL
2005
ACM
13 years 11 months ago
Downloading Textual Hidden Web Content Through Keyword Queries
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...
Alexandros Ntoulas, Petros Zerfos, Junghoo Cho