Sciweavers

53 search results - page 10 / 11
» Expectation Propagation for approximate Bayesian inference
AAAI
1998
13 years 6 months ago
Bayesian Q-Learning
A central problem in learning in complex environments is balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of explora...
Richard Dearden, Nir Friedman, Stuart J. Russell
ICML
2004
IEEE
14 years 5 months ago
Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data
In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when longra...
Charles A. Sutton, Khashayar Rohanimanesh, Andrew ...
JMLR
2010
12 years 11 months ago
Variational methods for Reinforcement Learning
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
Thomas Furmston, David Barber
ALT
2004
Springer
14 years 1 months ago
Relative Loss Bounds and Polynomial-Time Predictions for the k-lms-net Algorithm
We consider a two-layer network algorithm. The first layer consists of an uncountable number of linear units. Each linear unit is an LMS algorithm whose inputs are first “kerne...
Mark Herbster
WSC
2007
13 years 7 months ago
New greedy myopic and existing asymptotic sequential selection procedures: preliminary empirical results
Statistical selection procedures can identify the best of a finite set of alternatives, where “best” is defined in terms of the unknown expected value of each alternative’...
Stephen E. Chick, Jürgen Branke, Christian Sc...