Show simple item record

dc.contributor.authorGrossi, Roberto
dc.contributor.authorGupta, Ankur
dc.contributor.authorVitter, Jeffrey Scott
dc.date.accessioned2011-03-18T19:01:12Z
dc.date.available2011-03-18T19:01:12Z
dc.date.issued2003
dc.identifier.citationR. Grossi, A. Gupta, and J. S. Vitter. “High-Order Entropy-Compressed Text Indexes,” being partitioned with more recent work for submission to journals. An extended abstract appears in Proceedings of the 14th Annual SIAM/ACM Symposium on Discrete Algorithms (SODA ’03), Baltimore, MD, January 2003, 841–850.
dc.identifier.urihttp://hdl.handle.net/1808/7192
dc.description.abstractWe present a novel implementation of compressed su~x arrays exhibiting new tradeoffs between search time and space occupancy for a given text (or sequence) of n symbols over an alphabet E, where each symbol is encoded by lg ]E I bits. We show that compressed su1~x arrays use just nHh + O(n lglg n~ lgl~ I n) bits, while retaining full text indexing functionalities, such as searching any pattern sequence of length m in O(mlg [E[ + polylog(n)) time. The term Hh < lg IEI denotes the hth-order empirical entropy of the text, which means that our index is nearly optimal in space apart from lower-order terms, achieving asymptotically the empirical entropy of the text (with a multiplicative constant 1). If the text is highly compressible so that H~ = o(1) and the alphabet size is small, we obtain a text index with o(m) search time that requires only o(n) bits. Further results and tradeoffs are reported in the paper.
dc.language.isoen_US
dc.publisherSociety for Industrial and Applied Mathematics Philadelphia
dc.relation.isversionofhttp://portal.acm.org/citation.cfm?id=644250
dc.titleHigh-Order Entropy-Compressed Text Indexes
dc.typeArticle
kusw.kuauthorVitter, Jeffrey Scott
kusw.oastatusfullparticipation
kusw.oaversionScholarly/refereed, publisher version
kusw.oapolicyThis item meets KU Open Access policy criteria.
dc.rights.accessrightsopenAccess


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record