Sciweavers

1632 search results for "Sublinear Time Algorithms for Metric Space Problems"
PVLDB
2008
Discovering data quality rules
Dirty data is a serious problem for businesses leading to incorrect decision making, inefficient daily operations, and ultimately wasting both time and money. Dirty data often ari...
Fei Chiang, Renée J. Miller
KDD
2009
ACM
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
JIB
2007
Duplicate detection of 2D-NMR Spectra
2D-Nuclear magnetic resonance (NMR) spectra are used in the (structural) analysis of small molecules. In contrast to 1D-NMR spectra, 2D-NMR spectra correlate the chemical shifts o...
Alexander Hinneburg, Björn Egert, Andrea Porz...
BIBE
2008
IEEE
Optimizing performance, cost, and sensitivity in pairwise sequence search on a cluster of PlayStations
The Smith-Waterman algorithm is a dynamic programming method for determining optimal local alignments between nucleotide or protein sequences. However, it suffers from quadrati...
Ashwin M. Aji, Wu-chun Feng
SIGMOD
1997
ACM
Scalable Parallel Data Mining for Association Rules
One of the important problems in data mining is discovering association rules from databases of transactions where each transaction consists of a set of items. The most time consu...
Eui-Hong Han, George Karypis, Vipin Kumar