Data cubes provide a powerful data analysis tool called the range-sum query. The range-sum query is widely used and is important for finding trends and discovering relationships between attributes in diverse database applications. It sums over the selected cells of an OLAP data cube, where the target cells are determined by the specified query ranges. Accessing the data cube directly forces too many cells to be read and therefore incurs a severe overhead. Response time is crucial for OLAP applications, which require interaction with users. In today's dynamic enterprise environments, data elements in the cube change frequently, so the response time is affected by the update cost as well as the search cost of the cube. Existing techniques for range-sum queries on data cubes use an additional cube, called the prefix sum cube (PC), to store the cumulative sums of data, which causes a high space overhead. This space overhead not only raises the cost of storage devices, but also causes additional update propagation and longer access times on physical devices.
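To make the trade-off concrete, the following is a minimal sketch of the conventional prefix sum approach, assuming a small two-dimensional cube held in nested lists. The function names (`build_prefix_cube`, `range_sum`, `cells_affected_by_update`) are hypothetical and chosen for illustration only; this is not the method proposed in this thesis.

```python
def build_prefix_cube(cube):
    """Return P where P[i][j] is the sum of cube[0..i][0..j] (inclusive)."""
    rows, cols = len(cube), len(cube[0])
    prefix = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        row_sum = 0  # running sum of the current row up to column j
        for j in range(cols):
            row_sum += cube[i][j]
            above = prefix[i - 1][j] if i > 0 else 0
            prefix[i][j] = row_sum + above
    return prefix


def range_sum(prefix, lo, hi):
    """Sum of the cells in the inclusive range lo=(r1, c1) .. hi=(r2, c2).

    Uses inclusion-exclusion on at most four cells of the prefix sum cube.
    """
    (r1, c1), (r2, c2) = lo, hi

    def p(i, j):
        # Indices below zero refer to an empty region, whose sum is 0.
        return prefix[i][j] if i >= 0 and j >= 0 else 0

    return p(r2, c2) - p(r1 - 1, c2) - p(r2, c1 - 1) + p(r1 - 1, c1 - 1)


def cells_affected_by_update(shape, cell):
    """Number of prefix cells that change when one data cell is updated."""
    rows, cols = shape
    r, c = cell
    return (rows - r) * (cols - c)
```

In this sketch a query reads at most four prefix cells regardless of the range size, whereas changing the single cell at the origin of a 3 x 3 cube forces all nine prefix cells to be rewritten. The prefix cube is also as large as the data cube itself, which illustrates the space overhead described above.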
In this thesis, we first propose an efficient algorithm that significantly reduces the update cost while maintaining reasonable search efficiency, by using an index structure called the d-tree. In addition, we propose a hybrid method that provides either an approximate result or a precise one in order to reduce the overall cost of queries. This method is useful for applications, such as decision support systems, that need a quick approximate answer rather than an exact one.
Next, we present a new cube representation called the "PC Pool", which drastically reduces the space required by the PC in a large data warehouse. The PC Pool also decreases the update propagation caused by the dependency between the values in the cells of the PC. We develop an effective algorithm that finds dense sub-cubes in a large data cube. We perform extensive experiments with diverse data sets, and examine the...