In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques become no longer suffici...
Rabab Hayek, Guillaume Raschia, Patrick Valduriez,...
Nowadays privacy becomes a major concern and many research efforts have been dedicated to the development of privacy protecting technology. Anonymization techniques provide an eff...
In order to support the navigation in huge document collections efficiently, tagged hierarchical structures can be used. Often, multiple tags are used to describe resources. For u...