We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
Nowadays, searches for the web pages of a person with a given name constitute a notable fraction of queries to Web search engines. Such a query would normally return web pages rela...
Dmitri V. Kalashnikov, Zhaoqi Chen, Sharad Mehrotr...
Traditional methods for extracting Web news article content are time-consuming and maintenance-heavy because they analyze the layout of news pages to generate the wrapp...
Rather than representing a document as a "bag of words", as most information retrieval applications do, we propose a model that represents a web page as sets of named enti...
Nan Di, Conglei Yao, Mengcheng Duan, Jonathan J. H...
In this chapter, we characterize problems for web applications, examine existing testing techniques that are potentially applicable to the web environment, and introduce a strateg...