Sciweavers

563 search results - page 32 / 113
» Crawling the web for structured documents
TREC
2004
15 years 3 months ago
Language Models for Searching in Web Corpora
We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
Jaap Kamps, Gilad Mishne, Maarten de Rijke
KDD
2007
ACM
16 years 2 months ago
XProj: a framework for projected structural clustering of XML documents
XML has become a popular method of data representation both on the web and in databases in recent years. One of the reasons for the popularity of XML has been its ability to encod...
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua F...
CIKM
2007
Springer
15 years 8 months ago
Effective top-k computation in retrieving structured documents with term-proximity support
Modern web search engines are expected to return top-k results efficiently given a query. Although many dynamic index pruning strategies have been proposed for efficient top-k com...
Mingjie Zhu, Shuming Shi, Mingjing Li, Ji-Rong Wen
VRML
1995
ACM
15 years 5 months ago
Visualizing the Structure of the World Wide Web in 3D Hyperbolic Space
We visualize the structure of sections of the World Wide Web by constructing graphical representations in 3D hyperbolic space. The felicitous property that hyperbolic space has ...
Tamara Munzner, Paul Burchard
PREMI
2011
Springer
14 years 4 months ago
Finding Potential Seeds through Rank Aggregation of Web Searches
This paper presents a potential seed selection algorithm for web crawlers using a gain-share scoring approach. Initially, we consider a set of arbitrarily chosen tourism queries. ...
Rajendra Prasath, Pinar Öztürk