Abstract
DNA sequences of higher organisms contain thousands of nearly identical, dispersed repetitive sequences. To understand the effect of such repeats on word entropies, we construct a model that can be treated analytically. The hypothetical model sequences consist of independent, equidistributed symbols with randomly interspersed repeats. From this model, we predict that the entropy of DNA sequences, a measure of their information content, is much lower than suggested by earlier empirical studies.
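As a rough numerical illustration of the kind of model described above, the following sketch generates independent, equidistributed symbols, pastes copies of one repeat at random positions, and estimates the word entropy from observed word frequencies. It is not the paper's analytical treatment. The function names, the four-letter alphabet, the choice to overwrite symbols rather than insert them, the use of exactly identical copies, and the plug-in frequency estimator are all assumptions made here for illustration.

```python
import math
import random
from collections import Counter


def make_sequence(length, repeat, n_copies, alphabet="ACGT", seed=0):
    """Return i.i.d. equidistributed symbols with copies of `repeat`
    pasted over random positions (hypothetical helper; copies may overlap)."""
    rng = random.Random(seed)
    seq = [rng.choice(alphabet) for _ in range(length)]
    for _ in range(n_copies):
        start = rng.randrange(0, length - len(repeat) + 1)
        # Overwrite in place so the total length stays fixed (an assumption).
        seq[start:start + len(repeat)] = repeat
    return "".join(seq)


def word_entropy(seq, n):
    """Plug-in Shannon entropy, in bits, of the overlapping words of length n."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    rng = random.Random(1)
    repeat = "".join(rng.choice("ACGT") for _ in range(300))
    plain = make_sequence(20000, repeat, n_copies=0)
    with_repeats = make_sequence(20000, repeat, n_copies=40)
    for n in (2, 4, 6):
        print(n, word_entropy(plain, n), word_entropy(with_repeats, n))
```

In this toy setting the sequence with repeats should show a lower estimated word entropy than the purely random one, because a large share of its words come from the same repeated segment. The plug-in estimator is biased downward for short sequences and long words, so the numbers indicate a trend only.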
Received 23 May 1994
DOI: https://doi.org/10.1103/PhysRevE.50.5061
©1994 American Physical Society