Sciweavers

17 search results - page 2 / 4
» Analysis of a Classification-based Policy Iteration Algorith...
INFOCOM
2007
IEEE
15 years 4 months ago
Near-Optimal Data Dissemination Policies for Multi-Channel, Single Radio Wireless Sensor Networks
We analyze the performance limits of data dissemination with multi-channel, single radio sensors. We formulate the problem of minimizing the average delay of data dissem...
David Starobinski, Weiyao Xiao, Xiangping Qin, Ari...
ICONIP
2009
14 years 7 months ago
Tracking in Reinforcement Learning
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
ICML
2010
IEEE
14 years 10 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
CORR
2010
Springer
14 years 9 months ago
Global Optimization for Value Function Approximation
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...
Marek Petrik, Shlomo Zilberstein
JCDL
2005
ACM
15 years 3 months ago
Downloading textual hidden web content through keyword queries
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...
Alexandros Ntoulas, Petros Zerfos, Junghoo Cho