As data have been growing rapidly in data centers, deduplication storage systems continuously face challenges in providing the corresponding throughputs and capacities necessary t...
Wei Dong, Fred Douglis, Kai Li, R. Hugo Patterson,...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
Abstract. Clustering is a problem of great practical importance in numerous applications. The problem of clustering becomes more challenging when the data is categorical, that is, ...
In this paper, we investigate the suitability of clustered architectures for designing scalable multimedia servers. Specifically, we evaluate the effects of: (i) architectural des...
Renu Tewari, Rajat Mukherjee, Daniel M. Dias, Harr...
K-anonymisation is an approach to protecting privacy contained within a dataset. A good k-anonymisation algorithm should anonymise a dataset in such a way that private information...