One of the current problems in Web Information Retrieval (IR) is identifying redundant information that exists in (replicated) Web documents. These documents can easily be found in...
Abstract. We present the results of UMBC’s participation in the Web and Novelty tracks. We explored various heuristics-based link analysis approaches to the Topic Distillation ta...
Srikanth Kallurkar, Yongmei Shi, R. Scott Cost, Ch...
Background: The indexing of scientific literature and content is a relevant and contemporary requirement within life science information systems. Navigating information available ...
Christopher J. O. Baker, Kanagasabai Rajaraman, We...
Current keyword search by Google, Yahoo, and so on returns an enormous number of unsuitable results. One possible solution is to annotate textual web data with semantics to enable semantic ...
This paper presents a potential seed selection algorithm for web crawlers using a gain-share scoring approach. Initially, we consider a set of arbitrarily chosen tourism queries. ...