Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
In this paper we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous data objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks...
Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, ...
There exist many interrelated information sources on the Internet that can be categorized into structured (database) and semistructured (documents). A key challenge is to integrat...
This paper presents an extensive study about the evolution of textual content on the Web, which shows how some new pages are created from scratch while others are created using al...
Extracting information from web pages is an important problem; it has several applications such as providing improved search results and construction of databases to serve user qu...
Paramveer S. Dhillon, Sundararajan Sellamanickam, ...