Sciweavers

» Stochastic complexity in learning
ICML
2004
IEEE
16 years 5 months ago
Utile distinction hidden Markov models
This paper addresses the problem of constructing good action selection policies for agents acting in partially observable environments, a class of problems generally known as Part...
Daan Wierstra, Marco Wiering
NIPS
2003
15 years 5 months ago
Approximate Policy Iteration with a Policy Language Bias
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
Alan Fern, Sung Wook Yoon, Robert Givan
SIGPRO
2008
15 years 4 months ago
An adaptive penalized maximum likelihood algorithm
The LMS algorithm is one of the most popular learning algorithms for identifying an unknown system. Many variants of the algorithm have been developed based on different problem f...
Guang Deng, Wai-Yin Ng
COLT
2010
Springer
15 years 2 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
JAIR
2011
14 years 11 months ago
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making proble...
Mahdi Milani Fard, Joelle Pineau