Loading...
Thumbnail Image
Publication

Faster Compressed Dictionary Matching (extended abstract)

Hon, Wing-Kai
Lam, Tak-Wah
Shah, Rahul
Siu-Lung, Tam
Vitter, Jeffrey Scott
Citations
Altmetric:
Abstract
The past few years have witnessed several exciting results on compressed represen- tation of a string T that supports e±cient pattern matching, and the space complexity has been reduced to jTjHk(T)+o(jTj log ¾) bits [8, 10], where Hk(T) denotes the kth- order empirical entropy of T, and ¾ is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection D of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found e±ciently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log ¾))-bit index with only a small sacri¯ce in search time.
Description
The original publication is available at www.springerlink.com
Date
2010
Journal Title
Journal ISSN
Volume Title
Publisher
Springer Verlag
Research Projects
Organizational Units
Journal Issue
Keywords
Citation
W.-K. Hon, Tsung-Han Ku, R. Shah, S. V. Thankachan, and J. S. Vitter. “Faster Compressed Dictionary Matching,” in preparation. An extended abstract appears in Proceedings of the 17th International Conference on String Processing and Information Retrieval (SPIRE ’10), Los Cabos, Mexico, October 2010, published in Lecture Notes in Computer Science, Springer, Berlin, Germany. http://dx.doi.org/10.1007/978-3-642-16321-0_19
Embedded videos