Efficient Computation of Iceberg Cubes with Complex Measures
Document Type
Conference Proceeding
Publication Date
6-2001
Abstract
It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for OLAP and data mining.
In this paper, we study efficient methods for computing iceberg cubes with commonly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning the search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve performance, a hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and that Top-k H-Cubing performs better in most cases.
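The abstract does not spell out the pruning test, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the iceberg condition is "average of the measure is at least `threshold`, over cells holding at least `k` tuples", and it assumes the top-k average of a cell is the mean of its `k` largest measure values. The function names `top_k_average` and `can_prune` are hypothetical and do not come from the paper. Under these assumptions the test is anti-monotonic: a more specific cell holds a subset of the tuples, so its average cannot exceed the top-k average of the cell it was derived from.

```python
from typing import Optional, Sequence


def top_k_average(values: Sequence[float], k: int) -> Optional[float]:
    """Mean of the k largest values, or None if the cell has fewer than k tuples.

    Assumption: this mirrors the "top-k average" named in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(values) < k:
        return None
    largest = sorted(values, reverse=True)[:k]
    return sum(largest) / k


def can_prune(values: Sequence[float], k: int, threshold: float) -> bool:
    """True if no sub-cell with at least k tuples can reach the threshold.

    Assumption: the iceberg condition is avg >= threshold with count >= k.
    A cell with fewer than k tuples can never satisfy the count condition,
    and neither can any of its sub-cells, so it is pruned as well.
    """
    bound = top_k_average(values, k)
    return bound is None or bound < threshold


# Illustrative measure values for one cell (made-up numbers).
cell = [10.0, 2.0, 8.0, 1.0]

# The plain average fails a threshold of 9, yet the sub-cell {10, 8} passes,
# so pruning on the plain average alone would be unsafe.
plain_average = sum(cell) / len(cell)

# The top-2 average is an upper bound for every sub-cell with >= 2 tuples.
keep_exploring = not can_prune(cell, k=2, threshold=9.0)
safe_to_drop = can_prune(cell, k=2, threshold=9.5)
```

In this sketch the plain average of the cell falls below 9 even though one of its sub-cells meets the threshold, which is why a weaker bound such as the top-k average is needed before a branch of the search space is discarded.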
Repository Citation
Han, J., Pei, J., Dong, G., & Wang, K. (2001). Efficient Computation of Iceberg Cubes with Complex Measures. ACM SIGMOD Record, 30(2), 1-12.
https://corescholar.libraries.wright.edu/knoesis/419
DOI
10.1145/376284.375664
Comments
Article also presented at the ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, May 21-24, 2001.