Physics – Condensed Matter – Disordered Systems and Neural Networks
Scientific paper
1998-09-21
Eur. Phys. J. B 7, pp. 293-308 (1999)
Physics
Condensed Matter
Disordered Systems and Neural Networks
17 pages, 12 figures. Accepted for publication in EPJ B
Scientific paper
10.1007/s100510050616
The Longest Common Subsequence (LCS) Problem asks for the longest sequence of (non-contiguous) matches between two given strings of characters. Using extensive Monte Carlo simulations, we find a finite size scaling law of the form E(L)/N =C + A/(N^1/2 ln N)+... for the mean LCS length of two random strings of size N over S letters. We provide precise estimates of C for S between 2 and 15. We consider also a related Bernoulli Matching model where the different entries of an N times M array are independently occupied with probability 1/S. In that case we find the expression of the limit of L(N,M)/N as N grows to infinity, as a function of r=M/N. This expression provides a very good approximation for the Random String model, which gets more and more accurate as S increases. The question of the ``universality class'' of the LCS problem is also considered. For the Bernoulli Matching model we find very good agreement with recent scaling predictions of Hwa and Lassig for Needleman-Wunsch sequence alignment. We find however that the variance of the LCS length has a different scaling different in the Random String model, suggesting that long-ranged correlations among the matches are relevant in this model. We finally study the ``ground state'' properties of this problem. We find in particular that the number of solutions typically grows exponentially with N, i.e. this system has a residual entropy at T=0. Also the overlap between two LCSs chosen at random is found to be self averaging and to aproach a definite value q(S)<1 as N grows.
No associations
LandOfFree
Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space properties will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-129966