Optimizing the CVaR via Sampling

Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method allows CVaR optimization to be considered in new domains. As an example, we consider a reinforcement learning application and learn a risk-sensitive controller for the game of Tetris.
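
To make the idea of a sampling-based, likelihood-ratio style gradient estimate concrete, here is a minimal Python sketch on a toy problem. It is an illustration under stated assumptions, not the paper's exact estimator or experimental setup. The assumptions are: the random reward is Z = X with X ~ N(theta, 1), CVaR is taken as the mean of the lowest alpha-fraction of rewards, and the VaR is replaced by the empirical alpha-quantile. The function name `cvar_gradient_estimate` and all parameters are hypothetical. For this toy model the CVaR shifts one-to-one with theta, so the true gradient is 1, which gives a simple sanity check.

```python
import math
import random


def cvar_gradient_estimate(theta, alpha, n_samples, rng):
    """Likelihood-ratio style estimate of d/d(theta) CVaR_alpha.

    Toy assumption (not from the paper): reward Z = X, with X ~ N(theta, 1).
    """
    # Draw samples of the random reward.
    xs = [rng.gauss(theta, 1.0) for _ in range(n_samples)]

    # Empirical alpha-quantile (VaR estimate): the ceil(alpha * N)-th smallest sample.
    k = max(1, math.ceil(alpha * n_samples))
    var_hat = sorted(xs)[k - 1]

    # Sum score * (reward - VaR) over the samples in the lower alpha-tail.
    # For N(theta, 1), the score d/d(theta) log f(x; theta) equals (x - theta).
    total = 0.0
    for x in xs:
        if x <= var_hat:
            total += (x - theta) * (x - var_hat)

    # Normalize by alpha * N, the expected number of tail samples.
    return total / (alpha * n_samples)


rng = random.Random(0)
grad = cvar_gradient_estimate(theta=0.5, alpha=0.05, n_samples=200_000, rng=rng)
print(f"estimated gradient: {grad:.3f}")  # should be close to 1 for this toy model
```

In a stochastic gradient scheme, an estimate of this kind would be plugged into an update of the form theta <- theta + step * grad (ascent on the CVaR of a reward). Because only the tail samples contribute, the estimate is noisy for small alpha, which is why a large sample size is used here.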

Aviv Tamar, Yonatan Glassner, Shie Mannor

AAAI 2015 | Computer Science |

Post Info

Added	27 Mar 2016
Updated	27 Mar 2016
Type	Journal
Year	2015
Where	AAAI
Authors	Aviv Tamar, Yonatan Glassner, Shie Mannor
