Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Web search engines are traditionally evaluated in terms of the relevance of web pages to individual queries. However, relevance of web pages does not tell the complete picture, si...
We study and document an important development in how attackers are using Internet resources: the creation of malicious DNS resolution paths. In this growing form of attack, victi...
David Dagon, Niels Provos, Christopher P. Lee, Wen...
This paper describes our first participation in the Indian language sub-task of the main Adhoc monolingual and bilingual track in CLEF1 competition. In this track, the task is to...
We propose a new text mining system which extracts characteristic contents from given documents. We define Key semantics as characteristic sub-structures of syntactic dependencie...