Indexing Finite Language Representation of Population Genotypes

Computer Science – Data Structures and Algorithms

Scientific paper

Details

This is the full version of the paper that was presented at WABI 2011. The implementation is available at http://www.cs.hels

10.1007/978-3-642-23038-7_23

With recent advances in DNA sequencing, it is now possible to sequence and assemble the complete genomes of individuals. This rich and focused genotype information can be used for population-wide studies, now for the first time directly at the whole-genome level. We propose a way to index population genotype information together with the complete genome sequence, so that the index can be used to efficiently align a given sequence to the genome with all plausible genotype recombinations taken into account. This is achieved by converting a multiple alignment of individual genomes into a finite automaton that recognizes all strings that can be read from the alignment by switching the sequence at any time. The finite automaton is indexed with an extension of the Burrows-Wheeler transform to allow pattern search inside the plausible recombinant sequences. The size of the index stays limited because of the high similarity of individual genomes. The index finds applications in variation calling and in primer design. In a variation calling experiment, we found about 1.0% of matches to novel recombinants with exact matching alone, and up to 2.4% with approximate matching.
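The alignment-to-automaton step can be illustrated with a small Python sketch. It is not the paper's implementation and it omits the Burrows-Wheeler indexing entirely. It assumes a simplified construction in which identical symbols in the same alignment column are merged into one node, so a path can change rows only where rows agree; the paper's own construction may differ. The names `build_automaton` and `occurs`, the gap symbol `-`, and the toy alignment are all hypothetical.

```python
from collections import defaultdict


def build_automaton(rows, gap="-"):
    """Build a toy automaton from a multiple alignment (an assumed,
    simplified construction, not the paper's).

    Nodes are (column, symbol) pairs, so equal symbols in the same column
    are merged. Each row contributes edges from every non-gap symbol to
    the next non-gap symbol in that row.
    """
    if len({len(r) for r in rows}) > 1:
        raise ValueError("alignment rows must have equal length")
    edges = defaultdict(set)
    for row in rows:
        prev = None
        for col, sym in enumerate(row):
            if sym == gap:
                continue  # gaps produce no node; the row skips ahead
            node = (col, sym)
            edges[node]  # touching the key registers the node
            if prev is not None:
                edges[prev].add(node)
            prev = node
    return dict(edges)


def occurs(edges, pattern):
    """Return True if `pattern` labels some path in the automaton.

    This is a plain forward scan over the graph, used here only to show
    which strings the automaton accepts as substrings.
    """
    if not pattern:
        return True
    frontier = {n for n in edges if n[1] == pattern[0]}
    for sym in pattern[1:]:
        frontier = {m for n in frontier for m in edges[n] if m[1] == sym}
        if not frontier:
            return False
    return True


# Hypothetical two-genome alignment.
rows = ["ACGT", "ATGA"]
automaton = build_automaton(rows)
print(occurs(automaton, "ACGA"))  # recombinant: row 1 prefix, row 2 suffix
print(occurs(automaton, "CT"))    # not readable from this alignment
```

In this toy alignment, `ACGA` appears in neither input row but is accepted because both rows share the node for `G` in the third column. The forward scan stands in for the pattern search that the paper performs on a Burrows-Wheeler-based index.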
