Sciweavers

563 search results - page 10 / 113
» Crawling the web for structured documents
ADCS
2004
15 years 3 months ago
Focused Crawling in Depression Portal Search: A Feasibility Study
Previous work on domain-specific search services in the area of depressive illness has documented the significant human cost required to set up and maintain closed-crawl parameters....
Thanh Tin Tang, David Hawking, Nick Craswell, Rame...
ECAI
2008
Springer
15 years 3 months ago
Reinforcement Learning with Classifier Selection for Focused Crawling
Focused crawlers are programs that wander in the Web, using its graph structure, and gather pages that belong to a specific topic. The most critical task in Focused Crawling is the...
Ioannis Partalas, Georgios Paliouras, Ioannis P. V...
ECCV
2008
Springer
16 years 3 months ago
Learning Visual Shape Lexicon for Document Image Content Recognition
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
MAICS
2004
15 years 3 months ago
Creation of a Style Independent Intelligent Autonomous Citation Indexer to Support Academic Research
This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web, in a domain-independent and universal manner. B...
Eric G. Berkowitz, Mohamed Reda Elkhadiri
WWW
2011
ACM
14 years 8 months ago
Inverted index compression via online document routing
Modern search engines are expected to make documents searchable shortly after they appear on the ever-changing Web. To satisfy this requirement, the Web is frequently crawled. Due...
Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh