Sciweavers

95 search results for "Policy Gradients for Cryptanalysis" (page 19 of 19)
ECML
2004
Springer
13 years 10 months ago
Filtered Reinforcement Learning
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
Douglas Aberdeen
CN
2006
13 years 5 months ago
Measurement-based optimal routing on overlay architectures for unicast sessions
We propose a measurement-based routing algorithm to load-balance intradomain traffic along multiple paths for multiple unicast sources. Multiple paths are established using overla...
Tuna Güven, Richard J. La, Mark A. Shayman, B...
ASPDAC
2010
ACM
13 years 3 months ago
Hybrid dynamic energy and thermal management in heterogeneous embedded multiprocessor SoCs
Heterogeneous multiprocessor system-on-chips (MPSoCs) which consist of cores with various power and performance characteristics can customize their configuration to achieve higher ...
Shervin Sharifi, Ayse Kivilcim Coskun, Tajana Simu...
ML
2006
ACM
13 years 5 months ago
Universal parameter optimisation in games based on SPSA
Most game programs have a large number of parameters that are crucial for their performance. While tuning these parameters by hand is rather difficult, efficient and easy to use ge...
Levente Kocsis, Csaba Szepesvári
IWLCS
2005
Springer
13 years 10 months ago
Counter Example for Q-Bucket-Brigade Under Prediction Problem
Aiming to clarify the convergence or divergence conditions for Learning Classifier System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
Atsushi Wada, Keiki Takadama, Katsunori Shimohara