Indexing Schemes for Similarity Search in Datasets of Short Protein Fragments

Computer Science – Data Structures and Algorithms

Scientific paper

Details

34 pages, 12 figures, 4 tables - Timings for experiments added upon referees' request, and a number of less substantial modifi

We propose a family of very efficient hierarchical indexing schemes for ungapped, score matrix-based similarity search in large datasets of short (4-12 amino acid) protein fragments. This type of similarity search is important both as a building block for more complex algorithms and for possible direct use in biological investigations, where datasets are of the order of 60 million objects. Our scheme is based on the internal geometry of the amino acid alphabet and performs exceptionally well: for example, it outputs the 100 nearest neighbours of any possible fragment of length 10 after scanning, on average, less than one per cent of the entire dataset.
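
To make the query type concrete, the following is a minimal sketch of ungapped, score matrix-based nearest-neighbour search done as a brute-force linear scan. It is not the hierarchical indexing scheme proposed in the paper; it only illustrates the baseline that such an index is meant to avoid, since the scan touches every fragment. The names `SCORE`, `ungapped_score` and `nearest_neighbours` are hypothetical, and the toy scoring rule (+2 for a match, -1 for a mismatch) is an assumption standing in for a real substitution matrix.

```python
import heapq

# Hypothetical toy scoring rule (an assumption, NOT a real substitution
# matrix): +2 for identical residues, -1 for any mismatch.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
SCORE = {(a, b): (2 if a == b else -1) for a in ALPHABET for b in ALPHABET}


def ungapped_score(query, fragment, score=SCORE):
    """Sum the per-position scores of two equal-length fragments (no gaps)."""
    if len(query) != len(fragment):
        raise ValueError("ungapped comparison needs equal-length fragments")
    return sum(score[(q, f)] for q, f in zip(query, fragment))


def nearest_neighbours(query, dataset, k):
    """Return the k highest-scoring fragments by scanning the whole dataset."""
    return heapq.nlargest(k, dataset, key=lambda frag: ungapped_score(query, frag))


# Tiny illustrative dataset of length-4 fragments (made up for this sketch).
fragments = ["ACDE", "ACDF", "WYWY", "ACGE", "KLMN"]
top = nearest_neighbours("ACDE", fragments, k=3)
```

Here every query costs one score evaluation per stored fragment. An index of the kind described in the abstract aims to return the same neighbours while scanning only a small fraction of the dataset.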
