Sciweavers

ICML
2002
IEEE
14 years 5 months ago
An Alternate Objective Function for Markovian Fields
Sham Kakade, Yee Whye Teh, Sam T. Roweis
ICML
2002
IEEE
14 years 5 months ago
Discovering Hierarchy in Reinforcement Learning with HEXQ
An open problem in reinforcement learning is discovering hierarchical structure. HEXQ, an algorithm which automatically attempts to decompose and solve a model-free factored MDP h...
Bernhard Hengst
ICML
2002
IEEE
14 years 5 months ago
Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs
One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement ...
Carlos Guestrin, Relu Patrascu, Dale Schuurmans
ICML
2002
IEEE
14 years 5 months ago
Coordinated Reinforcement Learning
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value fu...
Carlos Guestrin, Michail G. Lagoudakis, Ronald Par...
ICML
2002
IEEE
14 years 5 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan
ICML
2002
IEEE
14 years 5 months ago
Combining Labeled and Unlabeled Data for MultiClass Text Categorization
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. One way to reduce the amount of labeled data required is t...
Rayid Ghani
ICML
2002
IEEE
14 years 5 months ago
Multi-Instance Kernels
Learning from structured data is becoming increasingly important. However, most prior work on kernel methods has focused on learning from attribute-value data. Only recently, rese...
Adam Kowalczyk, Alex J. Smola, Peter A. Flach, Tho...
ICML
2002
IEEE
14 years 5 months ago
On generalization bounds, projection profile, and margin distribution
We study generalization properties of linear learning algorithms and develop a data dependent approach that is used to derive generalization bounds that depend on the margin distr...
Ashutosh Garg, Sariel Har-Peled, Dan Roth
ICML
2002
IEEE
14 years 5 months ago
Univariate Polynomial Inference by Monte Carlo Message Length Approximation
We apply the Message from Monte Carlo (MMC) algorithm to inference of univariate polynomials. MMC is an algorithm for point estimation from a Bayesian posterior sample. It partiti...
Leigh J. Fitzgibbon, David L. Dowe, Lloyd Allison