Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Harmonic monitoring has become an important tool for harmonic management in distribution systems. A comprehensive harmonic monitoring program has been designed and implemented on ...
Compared with traditional association rule mining in the structured world (e.g. Relational Databases), mining from XML data is confronted with more challenges due to the inherent ...
Rahman Ali Mohammadzadeh, Sadegh Soltan, Masoud Ra...
The concepts and architecture underlying a large-scale integrating project funded within the 7th EU Framework Programme (FP7) are discussed. The main objective of the project is to...