Learning programs from noisy data

9 months 19 days ago
Learning programs from noisy data
We present a new approach for learning programs from noisy datasets. Our approach is based on two new concepts: a regularized program generator which produces a candidate program based on a small sample of the entire dataset while avoiding overfitting, and a dataset sampler which carefully samples the dataset by leveraging the candidate program’s score on that dataset. The two components are connected in a continuous feedback-directed loop. We show how to apply this approach to two settings: one where the dataset has a bound on the noise, and another without a noise bound. The second setting leads to a new way of performing approximate empirical risk minimization on hypotheses classes formed by a discrete search space. We then present two new kinds of program synthesizers which target the two noise settings. First, we introduce a novel regularized bitstream synthesizer that successfully generates programs even in the presence of incorrect examples. We show that the synthesizer can ...
Veselin Raychev, Pavol Bielik, Martin T. Vechev, A
Added 09 Apr 2016
Updated 09 Apr 2016
Type Journal
Year 2016
Where POPL
Authors Veselin Raychev, Pavol Bielik, Martin T. Vechev, Andreas Krause
Comments (0)