Improving data quality is a time-consuming, labor-intensive and often domain specific operation. Existing data repair approaches are either fully automated or not efficient in int...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...
The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This paper explores an application in which a second...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
In this paper we present GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...
Machine-learned ranking techniques automatically learn a complex document ranking function given training data. These techniques have demonstrated the effectiveness and flexibilit...
Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, ...