Sciweavers

4124 search results - page 308 / 825
» Improved Algorithms for Data Migration
KDD
2008
ACM
176 views, Data Mining
16 years 5 months ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
VLDB
2005
ACM
185 views, Database
16 years 5 months ago
Storing and querying XML data using denormalized relational databases
XML database systems emerge as a result of the acceptance of the XML data model. Recent works have followed the promising approach of building XML database management systems on un...
Andrey Balmin, Yannis Papakonstantinou
ACMMSP
2005
ACM
101 views, Hardware
15 years 10 months ago
Transparent pointer compression for linked data structures
64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and m...
Chris Lattner, Vikram S. Adve
KDD
2005
ACM
125 views, Data Mining
16 years 5 months ago
Email data cleaning
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Jie Tang, Hang Li, Yunbo Cao, ZhaoHui Tang
IJCNN
2008
IEEE
15 years 11 months ago
Two-level clustering approach to training data instance selection: A case study for the steel industry
Nowadays, huge amounts of information from different industrial processes are stored into databases and companies can improve their production efficiency by mining some new kn...
Heli Koskimäki, Ilmari Juutilainen, Perttu La...