In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
We study how to find plans that maximize the expected total utility for a given MDP, a planning objective that is important for decision making in high-stakes domains. The optimal...
Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods ...
Carlos Guestrin, Milos Hauskrecht, Branislav Kveto...
We propose new types of linguistic summaries of time-series data that extend those proposed in our previous papers. The proposed summaries of time series refer to the summaries of...
An alphabetic binary tree formulation applies to problems in which an outcome needs to be determined via alphabetically ordered search prior to the termination of some window of o...