Sciweavers

770 search results - page 47 / 154
» Large Scale Analysis of Search Engine Content
WWW
2010
ACM
15 years 11 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
CVPR
1997
IEEE
15 years 8 months ago
FOCUS: Searching for Multi-colored Objects in a Diverse Image Database
We describe a new multi-phase, color-based image retrieval system, FOCUS (Fast Object Color-based qUery System), with an online user interface which is capable of identifying mult...
Madirakshi Das, Edward M. Riseman, Bruce A. Draper
CIKM
2003
Springer
15 years 9 months ago
Automated index management for distributed web search
Distributed heterogeneous search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers d...
Rinat Khoussainov, Nicholas Kushmerick
WWW
2007
ACM
16 years 5 months ago
A large-scale study of robots.txt
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
Yang Sun, Ziming Zhuang, C. Lee Giles
CIKM
2010
Springer
15 years 2 months ago
Web search solved?: all result rankings the same?
The objective of this work is to derive quantitative statements about what fraction of web search queries issued to the state-of-the-art commercial search engines lead to excellen...
Hugo Zaragoza, Berkant Barla Cambazoglu, Ricardo A...