Sciweavers

KDD
2003
ACM

Mining data records in Web pages

A large amount of information on the Web is contained in regularly structured objects, which we call data records. Such data records are important because they often present the essential information of their host pages, e.g., lists of products or services. It is useful to mine such data records in order to extract information from them to provide value-added services. Existing automatic techniques are not satisfactory because of their poor accuracies. In this paper, we propose a more effective technique to perform the task. The technique is based on two observations about data records on the Web and a string matching algorithm. The proposed technique is able to mine both contiguous and noncontiguous data records. Our experimental results show that the proposed technique outperforms existing techniques substantially.

Categories and Subject Descriptors: I.5 [Pattern Recognition]: statistical and structural; H.2.8 [Database Applications]: data mining

Keywords: Web data records, Web mining,...
Bing Liu, Robert L. Grossman, Yanhong Zhai
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2003
Where KDD