In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
Abstract--Cooperative communications have been demonstrated to be effective in combating the multiple fading effects in wireless networks, and improving the network performance in ...
Xuedong Liang, Ilangko Balasingham, Victor C. M. L...
Ad-hoc Grids are highly heterogeneous and dynamic networks, one of the main challenges of resource allocation in such environments is to find mechanisms which do not rely on the ...
—We present a new algorithm for vertical handover and dynamic network selection, based on a combination of multiattribute utility theory, kernel learning and stochastic gradient ...
Eric van den Berg, Praveen Gopalakrishnan, Byungsu...
— Hierarchical state machines have proven to be a powerful tool for controlling autonomous robots due to their flexibility and modularity. For most real robot implementations, h...