Compressed Index for Dictionary Matching (extended abstract)

Hon, Wing-Kai; Lam, Tak-Wah; Shah, Rahul; Tam, Siu-Lung; Vitter, Jeffrey Scott

dc.contributor.author	Hon, Wing-Kai
dc.contributor.author	Lam, Tak-Wah
dc.contributor.author	Shah, Rahul
dc.contributor.author	Tam, Siu-Lung
dc.contributor.author	Vitter, Jeffrey Scott
dc.date.accessioned	2011-03-21T21:26:44Z
dc.date.available	2011-03-21T21:26:44Z
dc.date.issued	2008
dc.identifier.citation	W.-K. Hon, T.-W. Lam, R. Shah, S.-L. Tam, and J. S. Vitter. “Compressed Index for Dictionary Matching,” in preparation. An extended abstract appears in Proceedings of the 2008 IEEE Data Compression Conference (DCC ’08), Snowbird, UT, March 2008, 23-32. http://dx.doi.org/10.1109/DCC.2008.62
dc.identifier.uri	http://hdl.handle.net/1808/7232
dc.description	(c) 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
dc.description.abstract	The past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|H_k(T) + o(|T| log σ) bits [8, 10], where H_k(T) denotes the kth-order empirical entropy of T, and σ is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection D of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nH_k(D) + o(n log σ))-bit index with only a small sacrifice in search time.
dc.language.iso	en_US
dc.publisher	IEEE
dc.title	Compressed Index for Dictionary Matching (extended abstract)
dc.type	Article
kusw.kuauthor	Vitter, Jeffrey Scott
kusw.oastatus	fullparticipation
dc.identifier.doi	10.1109/DCC.2008.62
kusw.oaversion	Scholarly/refereed, author accepted manuscript
kusw.oapolicy	This item meets KU Open Access policy criteria.
dc.rights.accessrights	openAccess
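
The dictionary matching problem stated in the abstract (index a collection D of patterns, then report every occurrence of any pattern in a text T) can be illustrated with a minimal sketch. The code below uses the classical, uncompressed Aho-Corasick automaton, which takes space proportional to the total pattern length in machine words. It is not the compressed index described in the paper. The function names `build_automaton` and `find_occurrences` are hypothetical, and patterns are assumed to be non-empty.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton (illustrative, uncompressed).

    Returns three parallel lists indexed by state number:
    goto (child transitions), fail (failure links), and
    out (indices of the patterns that end at that state).
    """
    goto, fail, out = [{}], [0], [[]]

    # Phase 1: insert every pattern into a trie.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Phase 2: breadth-first traversal to set failure links.
    # Depth-1 states keep fail = 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target.
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def find_occurrences(patterns, text):
    """Return (start_position, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state, hits = 0, []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits
```

For example, `find_occurrences(["he", "she", "his", "hers"], "ushers")` reports "she" starting at position 1 and both "he" and "hers" starting at position 2 (0-based). Storing the automaton as Python lists and dictionaries is the kind of word-based representation that the abstract proposes to compress.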

Files in this item

Name:: HLS08compressedindex.pdf
Size:: 175.4Kb
Format:: PDF
