Sciweavers

SIGMOD
2008
ACM

Discovering topical structures of databases

The increasing complexity of enterprise databases and the prevalent lack of documentation incur significant cost in both understanding and integrating the databases. Existing solutions addressed mining for keys and foreign keys, but paid little attention to higher-level structures of databases. In this paper, we consider the problem of discovering topical structures of databases to support semantic browsing and large-scale data integration. We describe iDisc, a novel discovery system based on a multi-strategy learning framework. iDisc exploits varied evidence in database schema and instance values to construct multiple kinds of database representations. It employs a set of base clusterers to discover preliminary topical clusters of tables from database representations, and then aggregates them into final clusters via meta-clustering. To further improve accuracy, we extend iDisc with novel multiple-level aggregation and clusterer boosting techniques. We introduce a new measure on...
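The abstract describes combining the outputs of several base clusterers through meta-clustering without giving details here. As a rough illustration of the general idea only, the sketch below uses a simple co-association vote: two tables end up in the same final cluster when more than a chosen share of base clusterings group them together, with merges applied transitively. This is not iDisc's actual aggregation method. The function name `meta_cluster`, the `threshold` parameter, the table names, and the three base clusterings are all hypothetical.

```python
from itertools import combinations


def meta_cluster(tables, base_clusterings, threshold=0.5):
    """Merge tables that more than `threshold` of the base clusterings co-cluster.

    Illustrative co-association voting, not the method from the paper.
    """
    # Union-find structure: every table starts in its own cluster.
    parent = {t: t for t in tables}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in combinations(tables, 2):
        # Count base clusterings that assign both tables the same label.
        votes = sum(1 for c in base_clusterings if c[a] == c[b])
        if votes / len(base_clusterings) > threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for t in tables:
        clusters.setdefault(find(t), []).append(t)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical tables and base clusterer outputs (table -> cluster label).
tables = ["customer", "address", "order", "order_item", "product"]
base_clusterings = [
    # e.g. a clusterer using table and column names (assumed)
    {"customer": 0, "address": 0, "order": 1, "order_item": 1, "product": 2},
    # e.g. a clusterer using instance values (assumed)
    {"customer": 0, "address": 0, "order": 1, "order_item": 1, "product": 1},
    # e.g. a clusterer using key relationships (assumed)
    {"customer": 0, "address": 1, "order": 2, "order_item": 2, "product": 2},
]

final_clusters = meta_cluster(tables, base_clusterings)
print(final_clusters)
```

Because merges are transitive, this behaves like single-link clustering over the vote graph; a stricter `threshold` would yield smaller, more conservative clusters.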
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2008
Where SIGMOD
Authors Wensheng Wu, Berthold Reinwald, Yannis Sismanis, Rajesh Manjrekar