In the multi-armed bandit problem, a gambler must decide which arm of a K-slot machine to pull to maximize his total reward over a series of trials. Many real-world learning and opt...
Machine learning approaches to indoor WiFi localization involve an offline phase and an online phase. In the offline phase, data are collected from an environment to build a local...
Sinno Jialin Pan, Dou Shen, Qiang Yang, James T. K...
When the goal is to achieve the best correct classification rate, cross entropy and mean squared error are typical cost functions used to optimize classifier performance. However,...
Lian Yan, Robert H. Dodier, Michael Mozer, Richard...
The current framework of reinforcement learning is based on maximizing the expected return computed from scalar rewards. But in many real-world situations, tradeoffs must be made amon...
This paper addresses the problem of classifying observations when features are context-sensitive, specifically when the testing set involves a context that is different from the t...