The problem of finding an optimal bipartition of a rectangle set has a direct impact on query performance of dynamic R-trees. During update operations, overflowed nodes need to be...
This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in ...
Text data in the Internet can be partitioned into many databases naturally. Efficient retrieval of desired data can be achieved if we can accurately predict the usefulness of each...
Weiyi Meng, King-Lup Liu, Clement T. Yu, Xiaodong ...
A broad spectrum of data is available on the Web in distinct heterogeneous sources, and stored under different formats. As the number of systems that utilize this heterogeneous da...
We consider the problem of finding association rules that make nearly optimal binary segmentations of huge categorical databases. The optimality of segmentation is defined by an o...