Sciweavers

95 search results - page 17 / 19
» Policy Gradients for Cryptanalysis
NIPS
2003
14 years 11 months ago
Extending Q-Learning to General Adaptive Multi-Agent Systems
Recent multi-agent extensions of Q-Learning require knowledge of other agents’ payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This pap...
Gerald Tesauro
ICONIP
2007
14 years 11 months ago
Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents
The aim of the Cyber Rodent project [1] is to elucidate the origin of our reward and affective systems by building artificial agents that share the natural biological constraints...
Eiji Uchibe, Kenji Doya
SIGDIAL
2010
14 years 7 months ago
Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy
This paper presents a spoken dialogue framework that helps users in making decisions. Users often do not have a definite goal or criteria for selecting from a list of alternatives...
Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chi...
ACL
2009
14 years 7 months ago
Reinforcement Learning for Mapping Instructions to Actions
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
EDBT
2008
ACM
15 years 9 months ago
BI batch manager: a system for managing batch workloads on enterprise data-warehouses
Modern enterprise data warehouses have complex workloads that are notoriously difficult to manage. An important problem in workload management is to run these complex workloads ‘o...
Abhay Mehta, Chetan Gupta, Umeshwar Dayal