Sciweavers

» A time aggregation approach to Markov decision processes
UAI
2000
15 years 3 months ago
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Andrew Y. Ng, Michael I. Jordan
ENTCS
2008
15 years 1 months ago
Expressing Priorities and External Probabilities in Process Algebra via Mixed Open/Closed Systems
Defining operational semantics for a process algebra is often based either on labeled transition systems that account for interaction with a context or on the so-called reduction ...
Mario Bravetti
ISSS
1999
IEEE
15 years 6 months ago
Event-Driven Power Management of Portable Systems
The policy optimization problem for dynamic power management has received considerable attention in the recent past. We formulate policy optimization as a constrained optimization...
Tajana Simunic, Giovanni De Micheli, Luca Benini
ICANN
2001
Springer
15 years 6 months ago
Market-Based Reinforcement Learning in Partially Observable Worlds
Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an ...
Ivo Kwee, Marcus Hutter, Jürgen Schmidhuber
SIAMSC
2008
15 years 1 months ago
Multilevel Adaptive Aggregation for Markov Chains, with Application to Web Ranking
A multilevel adaptive aggregation method for calculating the stationary probability vector of an irreducible stochastic matrix is described. The method is a special case of the ada...
Hans De Sterck, Thomas A. Manteuffel, Stephen F. M...