Sciweavers

260 search results - page 48 / 52
» Industry-scale duplicate detection
SSDBM
2010
IEEE
220 views, Database
15 years 1 month ago
Prefix Tree Indexing for Similarity Search and Similarity Joins on Genomic Data
Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Astrid Rheinländer, Martin Knobloch, Nicky Ho...
DEXA
2006
Springer
197 views, Database
14 years 11 months ago
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife
BMCBI
2005
104 views
14 years 9 months ago
A statistical approach for array CGH data analysis
Background: Microarray-CGH experiments are used to detect and map chromosomal imbalances, by hybridizing targets of genomic DNA from a test and a reference sample to sequences imm...
Franck Picard, Stéphane Robin, Marc Laviell...
ICDE
2007
IEEE
127 views, Database
15 years 11 months ago
Supporting Streaming Updates in an Active Data Warehouse
Active Data Warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutsh...
Neoklis Polyzotis, Spiros Skiadopoulos, Panos Vass...
RECOMB
2004
Springer
15 years 10 months ago
The Statistical Significance of Max-Gap Clusters
Identifying gene clusters, genomic regions that share local similarities in gene organization, is a prerequisite for many different types of genomic analyses, including operon pred...
Rose Hoberman, David Sankoff, Dannie Durand