Essentially all data mining algorithms assume that the datagenerating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
The deep Web presents a pressing need for integrating large numbers of dynamically evolving data sources. To be more automatic yet accurate in building an integration system, we o...
Shui-Lung Chuang, Kevin Chen-Chuan Chang, ChengXia...
The training experiences needed by a learning system may be selected by either an external agent or the system itself. We show that knowledge of the current state of the learner...
Background: We propose a statistically principled baseline correction method, derived from a parametric smoothing model. It uses a score function to describe the key features of b...
Selecting informative genes from microarray experiments is one of the most important data analysis steps for deciphering biological information imbedded in such experiments. Howev...