Sciweavers

1235 search results - page 247 / 247
» Reinforcement learning in a nutshell
JMLR
2006
13 years 4 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
JSAC
2007
13 years 4 months ago
Non-Cooperative Power Control for Wireless Ad Hoc Networks with Repeated Games
One of the distinctive features in a wireless ad hoc network is lack of any central controller or single point of authority, in which each node/link then makes its own decision...
Chengnian Long, Qian Zhang, Bo Li, Huilong Yang, X...
JSAC
2010
13 years 3 months ago
An adaptive link layer for heterogeneous multi-radio mobile sensor networks
An important challenge in mobile sensor networks is to enable energy-efficient communication over a diversity of distances while being robust to wireless effects caused by node...
Jeremy Gummeson, Deepak Ganesan, Mark D. Corner, P...
QRE
2010
13 years 3 months ago
Improving quality of prediction in highly dynamic environments using approximate dynamic programming
In many applications, decision making under uncertainty often involves two steps: prediction of a certain quality parameter or indicator of the system under study and the subseque...
Rajesh Ganesan, Poornima Balakrishna, Lance Sherry
SIGIR
2011
ACM
12 years 7 months ago
Social context summarization
We study a novel problem of social context summarization for Web documents. Traditional summarization research has focused on extracting informative sentences from standard docume...
Zi Yang, Keke Cai, Jie Tang, Li Zhang, Zhong Su, J...