We cast name discrimination as a problem in clustering short contexts. Each occurrence of an ambiguous name is treated independently, and represented using second-order context vec...
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been li...
Kuang Chen, Harr Chen, Neil Conway, Joseph M. Hell...
Bagging is an ensemble method that uses random resampling of a dataset to construct models. In classification scenarios, the random resampling procedure in bagging induces some c...
We show that incorporating user behavior data can significantly improve the ordering of top results in a real web search setting. We examine alternatives for incorporating feedback into...
The particle filter is a powerful visual tracking tool based on the sequential Monte Carlo framework, and it needs large numbers of samples to properly approximate the posterior density...
Guangyu Zhu, Dawei Liang, Yang Liu, Qingming Huang...