Loading...
Bulk Operations for Space-Partitioning Trees
Ghanem, Thanaa M. ; Shah, Rahul ; Mokbel, Mohamed F. ; Aref, Walid ; Vitter, Jeffrey Scott
Ghanem, Thanaa M.
Shah, Rahul
Mokbel, Mohamed F.
Aref, Walid
Vitter, Jeffrey Scott
Citations
Altmetric:
Abstract
The emergence of extensible index structures, e.g.,
GiST (Generalized Search Tree) [25] and SP-GiST
(Space-Partitioning Generalized Search Tree) [3], calls
for a set of extensible algorithms to support different
operations (e.g., insertion, deletion, and search). Extensible
bulk operations (e.g., bulk loading and bulk insertion)
are of the same importance and need to be supported
in these index engines. In this paper, we propose
two extensible buffer-based algorithms for bulk operations
in the class of space-partitioning trees; a class of
hierarchical data structures that recursively decompose
the space into disjoint partitions. The main idea of
these algorithms is to build an in-memory tree of the
target space-partitioning index. Then, data items are
recursively partitioned into disk-based buffers using the
in-memory tree. Although the second algorithm is designed
for bulk insertion, it can be used in bulk loading
as well. The proposed extensible algorithms are implemented
inside SP-GiST; a framework for supporting
the class of space-partitioning trees. Both algorithms
have I/O bound 0(NH/B), where N is the number of
data items to be bulk loaded/inserted, B is the number
of tree nodes that can fit in one disk page, H is the tree
height in terms of pages after applying a clustering algorithm.
Experimental results are provided to show the
scalability and applicability of the proposed algorithms
for the class of space-partitioning trees. A comparison
of the two proposed algorithms shows that the first algorithm
performs better in case of bulk loading. However
the second algorithm is more general and can be used
for efficient bulk insertion.
Description
(c) 2004 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Date
2004
Journal Title
Journal ISSN
Volume Title
Publisher
IEEE
Research Projects
Organizational Units
Journal Issue
Keywords
Citation
T. M. Ghanem, R. Shah, M. F. Mokbel, W. G. Aref, and J. S. Vitter. “Bulk Operations for Space-Partitioning Trees,” Proceedings of the 20th Annual IEEE International Conference on Data Engineering (ICDE ’04), Boston, March–April 2004, 29–41.