We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
Integration of multiple heterogeneous data sources continues to be a critical problem for many application domains and a challenge for researchers world-wide. With the increasing ...
One of the fundamental problems in Content-Based Image Retrieval (CBIR) has been the gap between low-level visual features and high-level semantic concepts. To narrow down this gap...
Face recognition is a challenging task in computer vision and pattern recognition. It is well-known that obtaining a low-dimensional feature representation with enhanced discrimin...
Fei Wang, Jingdong Wang, Changshui Zhang, James T....
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...
Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...