Abstract
Genomic DNA is fragmented into segments using the Jensen-Shannon divergence. With this criterion, the resulting fragments are entropically homogeneous to within a predefined level of statistical significance. The procedure is applied to complete genomes of organisms from archaebacteria, eubacteria, and eukaryotes. The distribution of fragment lengths in bacterial and primitive eukaryotic DNAs shows two distinct regimes of power-law scaling. The characteristic length separating these two regimes appears to be an intrinsic property of the sequence rather than a finite-size artifact, and is independent of the significance level used in segmenting a given genome. Fragment length distributions obtained by segmenting the genomes of more highly evolved eukaryotes do not show such distinct regimes of power-law behavior.
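As an illustration of the segmentation criterion, the following minimal sketch computes the Jensen-Shannon divergence between the two sides of a candidate cut and finds the cut that maximizes it. It assumes the common form of the divergence in which each side is weighted by its fraction of the total length; the abstract does not specify the weighting. The function names (`shannon_entropy`, `js_divergence`, `best_split`) are hypothetical, and the statistical-significance test that decides whether a cut is accepted is not included.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of the symbol frequencies counts/total."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def js_divergence(seq, i):
    """Jensen-Shannon divergence between seq[:i] and seq[i:].

    Assumption: each side is weighted by its share of the total length,
    so the weighted mixture of the two distributions equals the symbol
    distribution of the whole sequence.
    """
    n = len(seq)
    left, right = seq[:i], seq[i:]
    h_all = shannon_entropy(Counter(seq), n)
    h_left = shannon_entropy(Counter(left), len(left))
    h_right = shannon_entropy(Counter(right), len(right))
    return h_all - (i / n) * h_left - ((n - i) / n) * h_right


def best_split(seq):
    """Return (divergence, position) for the cut with maximal divergence."""
    n = len(seq)
    return max(((js_divergence(seq, i), i) for i in range(1, n)),
               key=lambda pair: pair[0])


if __name__ == "__main__":
    # Toy sequence made of two compositionally distinct halves.
    divergence, position = best_split("AAAAAAAACCCCCCCC")
    print(divergence, position)
```

In a full segmentation, the cut would be accepted only if the maximal divergence exceeded a chosen significance threshold, and the procedure would then be repeated recursively on each resulting fragment.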
Received 8 October 2001
DOI:https://doi.org/10.1103/PhysRevE.65.051909
©2002 American Physical Society