Global statistical analysis of the protein homology network

Biology – Quantitative Biology – Quantitative Methods

Scientific paper

Details

15 pages, 15 figures

The similarity between protein sequences is a directly and easily computed quantity from which one can deduce information about their evolutionary distance and detect homologous proteins. The SIMAP database (Similarity Matrix of Proteins) provides a pre-computed similarity matrix covering the similarity space formed by nearly all publicly available amino acid sequences from public databases and completely sequenced genomes. From SIMAP we construct the protein homology network, in which the proteins are the nodes and the links represent homology relationships. With more than 5 million nodes and about 70 × 10^9 edges, it is the largest protein homology network ever built. We describe its basic features and perform a global statistical analysis of the network. Starting from the Smith-Waterman similarity score, we define for each edge a weight w that measures the similarity between the two nodes it connects. By keeping only edges with a weight greater than a minimal value w_min, and by varying w_min, we build a family of networks with different degrees of similarity. We investigate the distribution of connected components (clusters) of the networks at different w_min, and in particular we find a behaviour similar to a phase transition driven by the formation of a giant component. Moreover, we study selected sequence features and protein domains of the protein pairs that connect different clusters in the networks at different levels of similarity. We observe specific, non-random distributions of the protein features and domains for proteins connecting clusters at certain weight intervals.
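The thresholding procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the definition of the weight w, so the toy weights below are arbitrary, and the names `component_sizes`, `giant_fraction_sweep` and `TOY_EDGES` are hypothetical. The sketch keeps only edges with weight strictly greater than w_min, finds connected components with a union-find structure, and reports the fraction of nodes in the largest component for several thresholds.

```python
def component_sizes(num_nodes, weighted_edges, w_min):
    """Sizes of connected components (largest first), keeping edges with w > w_min."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, w in weighted_edges:
        if w <= w_min:
            continue  # edge is below the similarity threshold
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # already in the same cluster
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru  # attach the smaller tree to the larger one
        size[ru] += size[rv]

    roots = (r for r in range(num_nodes) if parent[r] == r)
    return sorted((size[r] for r in roots), reverse=True)


def giant_fraction_sweep(num_nodes, weighted_edges, thresholds):
    """Fraction of nodes in the largest component for each w_min."""
    return {
        w_min: component_sizes(num_nodes, weighted_edges, w_min)[0] / num_nodes
        for w_min in thresholds
    }


# Toy data (assumption): six proteins as nodes 0..5, edges as (u, v, weight).
TOY_EDGES = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.3), (3, 4, 0.7), (4, 5, 0.2)]

if __name__ == "__main__":
    for w_min, frac in giant_fraction_sweep(6, TOY_EDGES, [0.1, 0.5, 0.85]).items():
        print(f"w_min={w_min}: largest component holds {frac:.2f} of the nodes")
```

On the toy data, lowering w_min merges the clusters step by step until a single component contains every node. For a network at the scale reported in the abstract, the edges would have to be streamed from disk rather than held in a Python list; union-find fits that setting because it needs only one pass over the edges and memory proportional to the number of nodes.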
