We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evalua...
John Langford, Alexander L. Strehl, Jennifer Wortm...
Abstract. The use of the pseudo-random proportional rule to balance exploitation and exploration in the search process was shown in the Ant Colony System (ACS) algorith...
Existing keyword suggestion tools from various search engine companies could automatically suggest keywords related to the advertisers' products or services, counting in simp...
Gang Wang, Jian Hu, Yunzhang Zhu, Hua Li, Zheng Ch...
“Big Data” in map-reduce (M-R) clusters is often fundamentally temporal in nature, as are many analytics tasks over such data. For instance, display advertising uses Behavio...
Badrish Chandramouli, Jonathan Goldstein, Songyun ...
Abstract. In this paper, d-AdaptOR, a distributed opportunistic routing scheme for multi-hop wireless ad-hoc networks, is proposed. The proposed scheme utilizes a reinforcement lear...
Abhijeet Bhorkar, Mohammad Naghshvar, Tara Javidi,...