We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
When harvesting collective intelligence, a user wishes to maximize the accuracy and value of the acquired information without spending too much time collecting it. We empirically ...