Statistical Schema Matching across Web Query Interfaces

Schema matching is a critical problem for integrating heterogeneous information sources. Traditionally, the problem of matching multiple schemas has essentially relied on finding pairwise-attribute correspondence. This paper proposes a different approach, motivated by integrating large numbers of data sources on the Internet. On this "deep Web," we observe two distinguishing characteristics that offer a new view for considering schema matching: First, as the Web scales, there are ample sources that provide structured information in the same domains (e.g., books and automobiles). Second, while sources proliferate, their aggregate schema vocabulary tends to converge at a relatively small size. Motivated by these observations, we propose a new paradigm, statistical schema matching: Unlike traditional approaches using pairwise-attribute correspondence, we take a holistic approach to match all input schemas by finding an underlying generative schema model. We propose a general st...
Bin He, Kevin Chen-Chuan Chang
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2003