Computer Science – Data Structures and Algorithms
Scientific paper
2011-08-02
Computer Science
Data Structures and Algorithms
12 pages
Scientific paper
Let $\D = $$ \{d_1,d_2,...d_D\}$ be a given set of $D$ string documents of total length $n$, our task is to index $\D$, such that the $k$ most relevant documents for an online query pattern $P$ of length $p$ can be retrieved efficiently. We propose an index of size $|CSA|+n\log D(2+o(1))$ bits and $O(t_{s}(p)+k\log\log n+poly\log\log n)$ query time for the basic relevance metric \emph{term-frequency}, where $|CSA|$ is the size (in bits) of a compressed full text index of $\D$, with $O(t_s(p))$ time for searching a pattern of length $p$ . We further reduce the space to $|CSA|+n\log D(1+o(1))$ bits, however the query time will be $O(t_s(p)+k(\log \sigma \log\log n)^{1+\epsilon}+poly\log\log n)$, where $\sigma$ is the alphabet size and $\epsilon >0$ is any constant.
Hon Wing-Kai
Shah Rahul
Thankachan Sharma V.
No associations
LandOfFree
Towards an Optimal Space-and-Query-Time Index for Top-k Document Retrieval does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Towards an Optimal Space-and-Query-Time Index for Top-k Document Retrieval, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Towards an Optimal Space-and-Query-Time Index for Top-k Document Retrieval will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-509910