In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Gradient-following learning methods can encounter problems of implementation in many applications, and stochastic variants are frequently used to overcome these difficulties. We ...
State estimation consists of updating an agent’s belief given executed actions and observed evidence to date. In single agent environments, the state estimation can be formalize...
Traditional hop-by-hop dynamic routing makes inefficient use of network resources as it forwards packets along already congested shortest paths while uncongested longer paths may b...
Minsoo Lee, Xiaohui Ye, Dan Marconett, Samuel John...
Abstract. This paper reports student interaction patterns and self-reported results of using Twitter microblogging environment. The study employs longitudinal probabilistic social ...