CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees |
| |
Authors: | Joachim Hammer Lixin Fu |
| |
Affiliation: | (1) Computer & Information Science & Eng, University of Florida, Gainesville, FL 32611-6120, USA;(2) Division of Computer Science, University of North Carolina, Greensboro, Greensboro, NC 27402-6170, USA |
| |
Abstract: | We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We are focusing on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called statistics tree (ST) during a one-time scan of the detailed data. In order to optimize the queries involving constraints on hierarchy levels of the underlying dimensions, we select andmaterialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family, which can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems. |
| |
Keywords: | data cube data warehouse multi-dimensional OLAP query processing statistics tree |
本文献已被 SpringerLink 等数据库收录! |
|