Sciweavers

» Using Clustering and Edit Distance Techniques for Automatic ...
DASFAA
2005
IEEE
Automatic Data Extraction from Data-Rich Web Pages
Extracting data from web pages using wrappers is a fundamental problem arising in a wide variety of applications of great practical interest. In this paper, we propose a...
Dongdong Hu, Xiaofeng Meng
VLDB
2001
ACM
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extracti...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
WWW
2009
ACM
Extracting data records from the web using tag path clustering
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Gengxin Miao, Jun'ichi Tatemura, Wang-Pin Hsiung, ...
WSE
2003
IEEE
Using Keyword Extraction for Web Site Clustering
Reverse engineering techniques have the potential to support Web site understanding, by providing views that show the organization of a site and its navigational structure. Howeve...
Paolo Tonella, Filippo Ricca, Emanuele Pianta, Chr...
IAT
2006
IEEE
Semantic Labeling of Data by Using the Web
The Web consists of a large amount of unstructured information that can hardly be processed by automatic agents. In recent years, a considerable number of techniques for informat...
Leonardo Rigutini, Ernesto Di Iorio, Marco Ernande...