A game-theoretic approach for learning optimal parameter values for probabilistic rough set regions is presented. The parameters can be used to define approximation regions in a p...
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
Abstract. We present the first (to our knowledge) approximation algorithm for tensor clustering—a powerful generalization to basic 1D clustering. Tensors are increasingly common...
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|3 ) to directly solve the Bellman system of |S...
The performance of a learning classifier system is due to its two main components. First, it evolves new structures by generating new rules in a genetic process; second, it adjust...