In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
If V is a set of n points in the unit square [0, 1]2 , and if R : V + is an assignment of positive real numbers (radii) to to those points, define a graph G(R) as follows: {v, w}...
Abstract. We address the problem of continuous stochastic optimal control in the presence of hard obstacles. Due to the non-smooth character of the obstacles, the traditional appro...
Similarity-based Logic Programming (briefly, SLP) has been proposed to enhance the LP paradigm with a kind of approximate reasoning which supports flexible information retrieval a...
We present a deterministic approximation algorithm to compute logarithm of the number of `good' truth assignments for a random k-satisfiability (k-SAT) formula in polynomial ...