SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly

Computer Science – Data Structures and Algorithms

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Corresponding authors: Zhanshan (Sam) Ma, ma@vandals.uidaho.edu; Mihai Pop, mpop@umiacs.umd.edu || Availability: Programs in b

Scientific paper

Motivation: To tackle the problem of huge memory usage associated with de Bruijn graph-based algorithms, upon which some of the most widely used de novo genome assemblers have been built, we released SparseAssembler1. SparseAssembler1 can save as much as 90% memory consumption in comparison with the state-of-art assemblers, but it requires rounds of denoising to accurately assemble genomes. In this paper, we introduce a new general model for genome assembly that uses only sparse k-mers. The new model replaces the idea of the de Bruijn graph from the beginning, and achieves similar memory efficiency and much better robustness compared with our previous SparseAssembler1. Results: Based on the sparse k-mers graph model, we develop SparseAssembler2. We demonstrate that the decomposition of reads of all overlapping k-mers, which is used in existing de Bruijn graph genome assemblers, is overly cautious. We introduce a sparse k-mer graph structure for saving sparse k-mers, which greatly reduces memory space requirements necessary for de novo genome assembly. In contrast with the de Bruijn graph approach, we devise a simple but powerful strategy, i.e., finding links between the k-mers in the genome and traversing following the links, which can be done by saving only a few k-mers. To implement the strategy, we need to only select some k-mers that may not even be overlapping ones, and build the links between these k-mers indicated by the reads. We can traverse through this sparse k-mer graph to build the contigs, and ultimately complete the genome assembly. Since the new sparse k-mers graph shares almost all advantages of de Bruijn graph, we are able to adapt a Dijkstra-like breadth-first search algorithm, for the new sparse k-mer graph in order to circumvent sequencing errors and resolve polymorphisms.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-541831

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.