Sciweavers

1437 search results - page 102 / 288
» Content Extraction Signatures
WSDM
2012
ACM
13 years 10 months ago
Selecting actions for resource-bounded information extraction using reinforcement learning
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to d...
Pallika H. Kanani, Andrew K. McCallum
ICMCS
2007
IEEE
15 years 9 months ago
Attacking Some Perceptual Image Hash Algorithms
Perceptual hashing is an emerging solution for multimedia content authentication. Due to their robustness, such techniques might not work well when malicious attack is perceptuall...
Li Weng, Bart Preneel
ICDAR
2009
IEEE
15 years 10 months ago
Scalable Feature Extraction from Noisy Documents
We cope with the metadata recognition in layout-oriented documents. We address the problem as a classification task and propose a method for automatic extraction of relevant featu...
Loïc Lecerf, Boris Chidlovskii
CIKM
2007
Springer
15 years 9 months ago
Comments-oriented blog summarization by sentence extraction
Much existing research on blogs focused on posts only, ignoring their comments. Our user study conducted on summarizing blog posts, however, showed that reading comments does chan...
Meishan Hu, Aixin Sun, Ee-Peng Lim
SOFSEM
2007
Springer
15 years 9 months ago
Creating Permanent Test Collections of Web Pages for Information Extraction Research
In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...
Bernhard Pollak, Wolfgang Gatterbauer