Sciweavers

4651 search results - page 680 / 931
» A Data Quality Browser
AIR
2004
15 years 17 days ago
Class Noise vs. Attribute Noise: A Quantitative Study
Real-world data is never perfect and can often suffer from corruptions (noise) that may impact interpretations of the data, models created from the data and decisions made based on...
Xingquan Zhu, Xindong Wu
BMCBI
2004
15 years 16 days ago
Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
Background: A number of completely sequenced eukaryotic genome data are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronle...
Meena K. Sakharkar, Pandjassarame Kangueane
CANDC
2004
ACM
15 years 16 days ago
iProLINK: an integrated protein resource for literature mining
The exponential growth of large-scale molecular sequence data and of the PubMed scientific literature has prompted active research in biological literature mining and information ...
Zhang-Zhi Hu, Inderjeet Mani, Vincent Hermoso, Hon...
PVLDB
2008
15 years 4 days ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
SOCO
2010
Springer
14 years 11 months ago
Automatic detection of trends in time-stamped sequences: an evolutionary approach
This paper presents an evolutionary algorithm for modeling the arrival dates in time-stamped data sequences such as newscasts, e-mails, IRC conversations, scientific journal artic...
Lourdes Araujo, Juan Julián Merelo Guervó...