Sciweavers

1061 search results - page 186 / 213
» Massive Data Pre-Processing with a Cluster Based Approach
ICDM
2005
IEEE
Data Mining
15 years 7 months ago
Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping
The problem of record linkage focuses on determining whether two object descriptions refer to the same underlying entity. Addressing this problem effectively has many practical ap...
Mikhail Bilenko, Sugato Basu, Mehran Sahami
BMCBI
2006
15 years 1 months ago
IsoSVM - Distinguishing isoforms and paralogs on the protein level
Background: Recent progress in cDNA and EST sequencing is yielding a deluge of sequence data. Like database search results and proteome databases, this data gives rise to inferred...
Michael Spitzer, Stefan Lorkowski, Paul Cullen, Al...
SIGMOD
2009
ACM
Database
16 years 2 months ago
Exploiting context analysis for combining multiple entity resolution systems
Entity Resolution (ER) is an important real world problem that has attracted significant research interest over the past few years. It deals with determining which object descript...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
SIGIR
2006
ACM
15 years 7 months ago
User modeling for full-text federated search in peer-to-peer networks
User modeling for information retrieval has mostly been studied to improve the effectiveness of information access in centralized repositories. In this paper we explore user model...
Jie Lu, James P. Callan
KDD
2009
ACM
Data Mining
15 years 8 months ago
CoCo: coding cost for parameter-free outlier detection
How can we automatically spot all outstanding observations in a data set? This question arises in a large variety of applications, e.g. in economy, biology and medicine. Existing ...
Christian Böhm, Katrin Haegler, Nikola S. M...