Sciweavers

» Massive Data Pre-Processing with a Cluster Based Approach
KDD
2009
ACM
243 views, Data Mining
16 years 4 months ago
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
KDD
2004
ACM
195 views, Data Mining
16 years 4 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
EDBT
2008
ACM
102 views, Database
16 years 4 months ago
Summary management in P2P systems
Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques become no longer suffici...
Rabab Hayek, Guillaume Raschia, Patrick Valduriez,...
ADC
2010
Springer
204 views, Database
14 years 11 months ago
Systematic clustering method for l-diversity model
Nowadays privacy becomes a major concern and many research efforts have been dedicated to the development of privacy protecting technology. Anonymization techniques provide an eff...
Md. Enamul Kabir, Hua Wang, Elisa Bertino, Yunxian...
LWA
2008
15 years 5 months ago
Labeling Clusters - Tagging Resources
In order to support the navigation in huge document collections efficiently, tagged hierarchical structures can be used. Often, multiple tags are used to describe resources. For u...
Korinna Bade, Andreas Nürnberger