Score distributions of gapped multiple sequence alignments down to the low-probability tail

Pascal Fieth and Alexander K. Hartmann
Phys. Rev. E 94, 022127 – Published 19 August 2016

Abstract

Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain result for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.

  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
6 More
  • Received 8 December 2015

DOI:https://doi.org/10.1103/PhysRevE.94.022127

©2016 American Physical Society

Physics Subject Headings (PhySH)

  1. Physical Systems
Statistical Physics & Thermodynamics

Authors & Affiliations

Pascal Fieth* and Alexander K. Hartmann

  • Institut für Physik, Carl von Ossietzky Universität Oldenburg, D-26111 Oldenburg, Germany

  • *pascal.fieth@uni-oldenburg.de

Article Text (Subscription Required)

Click to Expand

References (Subscription Required)

Click to Expand
Issue

Vol. 94, Iss. 2 — August 2016

Reuse & Permissions
Access Options
Author publication services for translation and copyediting assistance advertisement

Authorization Required


×
×

Images

×

Sign up to receive regular email alerts from Physical Review E

Log In

Cancel
×

Search


Article Lookup

Paste a citation or DOI

Enter a citation
×