Computer Science – Data Structures and Algorithms
Scientific paper
2010-03-09
Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories based on the data structures they employ: the first uses an overlap/string graph and the second uses a de Bruijn graph. With the recent advances in short-read sequencing technology, however, de Bruijn graph based algorithms play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. Jackson et al. (ICPP 2008) gave an $O(n/p)$ time parallel algorithm for this problem, where $n$ is the size of the input and $p$ is the number of processors. That algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating $\Theta(n\Sigma)$ messages. In this paper we present a $\Theta(n/p)$ time parallel algorithm whose communication complexity equals that of parallel sorting and is not sensitive to $\Sigma$. The generality of our algorithm makes it easy to extend to the out-of-core model, where it has an optimal I/O complexity of $\Theta(\frac{n\log(n/B)}{B\log(M/B)})$. We demonstrate the scalability of our parallel algorithm on an SGI Altix computer. A comparison with the algorithm of Jackson et al. (ICPP 2008) shows that our algorithm is faster. We also provide efficient algorithms for the bi-directed chain compaction problem.
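To make the sorting-based idea concrete, here is a minimal sequential sketch in Python. It is not the authors' parallel or out-of-core algorithm. It only illustrates how bi-directed edges can be obtained by emitting one record per $(k+1)$-mer seen in the reads and then sorting, instead of probing every possible neighbour of a node. The function names, the `+`/`-` orientation labels, and the restriction to the alphabet `ACGT` are all assumptions made for illustration.

```python
from itertools import groupby

# Assumption: reads are DNA strings over the alphabet A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Return (canonical k-mer, orientation).

    The canonical form is the lexicographically smaller of the k-mer and
    its reverse complement; '+' means the k-mer was already canonical.
    """
    rc = reverse_complement(kmer)
    return (kmer, "+") if kmer <= rc else (rc, "-")


def edge_record(kplus1_mer):
    """Map a (k+1)-mer to a strand-independent bi-directed edge record."""
    # Pick one representative of the (k+1)-mer and its reverse complement,
    # so that both strands of the same overlap yield the same record.
    x = min(kplus1_mer, reverse_complement(kplus1_mer))
    u, u_sign = canonical(x[:-1])   # prefix k-mer
    v, v_sign = canonical(x[1:])    # suffix k-mer
    return (u, u_sign, v, v_sign)


def build_bidirected_edges(reads, k):
    """Return sorted (edge_record, multiplicity) pairs for all reads.

    One record is emitted per (k+1)-mer occurrence, so the amount of data
    to sort is proportional to the input size and does not depend on the
    alphabet size.
    """
    records = [
        edge_record(read[i:i + k + 1])
        for read in reads
        for i in range(len(read) - k)
    ]
    records.sort()  # stand-in for the parallel or external sort
    return [(rec, len(list(group))) for rec, group in groupby(records)]


if __name__ == "__main__":
    for edge, count in build_bidirected_edges(["ACGTT", "AACGT"], 3):
        print(edge, count)
```

In this sketch the single `records.sort()` call is the only step that would need communication in a distributed setting, which is the intuition behind a communication cost equal to that of sorting. A read and its reverse complement produce identical edge records, so strand information does not need to be known in advance.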
Dinh Hieu
Kundeti Vamsi
Rajasekaran Sanguthevar
Efficient Parallel and Out of Core Algorithms for Constructing Large Bi-directed de Bruijn Graphs