Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties

Physics – Condensed Matter – Disordered Systems and Neural Networks

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

17 pages, 12 figures. Accepted for publication in EPJ B

Scientific paper

10.1007/s100510050616

The Longest Common Subsequence (LCS) Problem asks for the longest sequence of (non-contiguous) matches between two given strings of characters. Using extensive Monte Carlo simulations, we find a finite size scaling law of the form E(L)/N =C + A/(N^1/2 ln N)+... for the mean LCS length of two random strings of size N over S letters. We provide precise estimates of C for S between 2 and 15. We consider also a related Bernoulli Matching model where the different entries of an N times M array are independently occupied with probability 1/S. In that case we find the expression of the limit of L(N,M)/N as N grows to infinity, as a function of r=M/N. This expression provides a very good approximation for the Random String model, which gets more and more accurate as S increases. The question of the ``universality class'' of the LCS problem is also considered. For the Bernoulli Matching model we find very good agreement with recent scaling predictions of Hwa and Lassig for Needleman-Wunsch sequence alignment. We find however that the variance of the LCS length has a different scaling different in the Random String model, suggesting that long-ranged correlations among the matches are relevant in this model. We finally study the ``ground state'' properties of this problem. We find in particular that the number of solutions typically grows exponentially with N, i.e. this system has a residual entropy at T=0. Also the overlap between two LCSs chosen at random is found to be self averaging and to aproach a definite value q(S)<1 as N grows.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-129966

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.