Currently most link-related applications treat all links in the same web page to be identical. One link-related application usually requires one certain property of hyperlinks but...
Mingliang Zhu, Weiming Hu, Ou Wu, Xi Li, Xiaoqin Z...
Following the exponential growth of social media, there now exist huge repositories of videos online. Among the huge volumes of videos, there exist large numbers of near-duplicate...
Hierarchical models are commonly used to organize a Website's content. A Website's content structure can be represented by a topic hierarchy, a directed tree rooted at a...
Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyon...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...