Sciweavers

682 search results - page 84 / 137
» One-Counter Markov Decision Processes
GLOBECOM
2008
IEEE
15 years 7 months ago
Foresighted Resource Reciprocation Strategies in P2P Networks
We consider peer-to-peer (P2P) networks, where multiple peers are interested in sharing content. While sharing resources, autonomous and self-interested peers need to make decis...
Hyunggon Park, Mihaela van der Schaar
AIPS
2000
15 years 2 months ago
On-line Scheduling via Sampling
We consider the problem of scheduling an unknown sequence of tasks for a single server as the tasks arrive, with the goal of maximizing the total weighted value of the tasks serv...
Hyeong Soo Chang, Robert Givan, Edwin K. P. Chong
CORR
2006
Springer
15 years 25 days ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
DSN
2009
IEEE
14 years 10 months ago
RRE: A game-theoretic intrusion Response and Recovery Engine
Preserving the availability and integrity of networked computing systems in the face of fast-spreading intrusions requires advances not only in detection algorithms, but also in a...
Saman A. Zonouz, Himanshu Khurana, William H. Sand...
GECCO
2009
Springer
14 years 10 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel