Computer Science – Data Structures and Algorithms
Scientific paper
2011-10-20
Computer Science
Data Structures and Algorithms
Scientific paper
We introduce the first grammar-compressed representation of a sequence that supports searches in time that depends only logarithmically on the size of the grammar. Given a text $T[1..u]$ that is represented by a (context-free) grammar of $n$ (terminal and nonterminal) symbols and size $N$ (measured as the sum of the lengths of the right hands of the rules), a basic grammar-based representation of $T$ takes $N\lg n$ bits of space. Our representation requires $2N\lg n + N\lg u + \epsilon\, n\lg n + o(N\lg n)$ bits of space, for any $0<\epsilon \le 1$. It can find the positions of the $occ$ occurrences of a pattern of length $m$ in $T$ in $O((m^2/\epsilon)\lg (\frac{\lg u}{\lg n}) +occ\lg n)$ time, and extract any substring of length $\ell$ of $T$ in time $O(\ell+h\lg(N/h))$, where $h$ is the height of the grammar tree.
Claude Francisco
Navarro Gonzalo
No associations
LandOfFree
Improved Grammar-Based Compressed Indexes does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Improved Grammar-Based Compressed Indexes, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Improved Grammar-Based Compressed Indexes will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-551620