This paper studies structured data extraction from Web pages, e.g., online product description pages. Existing approaches to data extraction include wrapper induction and automatic...
Surveys are an important part of marketing and customer relationship management, and open answers (i.e., answers to open questions) in particular may contain valuable information ...
Techniques for learning from data typically require data to be in standard form. Measurements must be encoded in a numerical format such as binary true-or-false features, numerica...
V. Seshadri, Raguram Sasisekharan, Sholom M. Weiss
In this paper, we present a combination of machine-learning-based Information Retrieval (IR) techniques and stochastic language modelling in a hierarchical system that extracts sur...
Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. However, the primary approach (supervis...