Sciweavers

» Automatic extraction of titles from general documents using ...
WWW
2005
ACM
15 years 10 months ago
Web data extraction based on partial tree alignment
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Yanhong Zhai, Bing Liu
ECCV
2008
Springer
15 years 11 months ago
Signature-Based Document Image Retrieval
As the most pervasive method of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effectiv...
Guangyu Zhu, Yefeng Zheng, David S. Doermann
ICADL
2010
Springer
15 years 2 months ago
Thesaurus Extension Using Web Search Engines
Maintaining and extending large thesauri is an important challenge facing digital libraries and IT businesses alike. In this paper we describe a method building on and extending ex...
Robert Meusel, Mathias Niepert, Kai Eckert, Heiner...
ESWS
2004
Springer
15 years 2 months ago
Learning to Harvest Information for the Semantic Web
In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
SOUPS
2009
ACM
15 years 4 months ago
Machine learning attacks against the Asirra CAPTCHA
The ASIRRA CAPTCHA [6], recently proposed at ACM CCS 2007, relies on the problem of distinguishing images of cats and dogs (a task that humans are very good at). The security of AS...
Philippe Golle