The Web poses itself as the largest data repository ever available in the history of humankind. Major efforts have been made in order to provide efficient access to relevant infor...
Davi de Castro Reis, Paulo Braz Golgher, Altigran ...
This paper studies automatic extraction of structured data from Web pages. Each of such pages may contain several groups of structured data records. Existing automatic methods stil...
This paper describes a method of extracting katakana words and phrases, along with their English counterparts from non-aligned monolingual web search engine query logs. The method...
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...