Simplified amino acid alphabets based on deviation of conditional probability from random background

Xin Liu, Di Liu, Ji Qi, and Wei-Mou Zheng
Phys. Rev. E 66, 021906 – Published 23 August 2002
Abstract

The primitive data for deducing the Miyazawa-Jernigan contact energy or the blocks substitution matrix (BLOSUM) consist of pair frequency counts. Each amino acid corresponds to a conditional probability distribution. Based on the deviation of such a conditional probability from the random background, a scheme for the reduction of the amino acid alphabet is proposed. An evident discrepancy is observed between the reduced alphabets obtained from the Miyazawa-Jernigan and the BLOSUM raw residue pair counts. Taking the homologous sequence database SCOP40 as a test set, we detect homology with the resulting coarse-grained substitution matrices. The reduced alphabets are verified to preserve well the information contained in the original 20-letter alphabet.
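
As a rough illustration of the quantities named in the abstract, the sketch below turns a symmetric table of pair counts into per-letter conditional distributions and a background distribution, scores each letter by how far its conditional distribution departs from the background, and merges two letters by pooling their counts. The abstract does not say which deviation measure or merging rule the authors use. The relative entropy (Kullback-Leibler divergence) used here, the function names, and the toy 3-letter count table are all assumptions made for illustration, not the paper's method or data.

```python
import math


def conditional_and_background(counts):
    """Return (P(j|i) rows, background q_j) from a square pair-count table.

    Assumes counts[i][j] is symmetric and that row sums act as marginals.
    """
    n = len(counts)
    row_sums = [sum(row) for row in counts]
    total = sum(row_sums)
    cond = [[counts[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
    background = [row_sums[j] / total for j in range(n)]
    return cond, background


def deviation(cond_row, background):
    """Relative entropy of one conditional row from the background.

    This measure is an assumption; the abstract only says "deviation".
    """
    return sum(p * math.log(p / q) for p, q in zip(cond_row, background) if p > 0)


def merge_letters(counts, a, b):
    """Pool letter b into letter a (a != b) by summing rows and columns."""
    n = len(counts)
    kept = [k for k in range(n) if k != b]
    new_index = {old: new for new, old in enumerate(kept)}
    new_index[b] = new_index[a]  # b is folded into a's group
    merged = [[0.0] * (n - 1) for _ in range(n - 1)]
    for i in range(n):
        for j in range(n):
            merged[new_index[i]][new_index[j]] += counts[i][j]
    return merged


# Hypothetical 3-letter pair counts, not taken from any real data set.
toy_counts = [
    [10, 2, 1],
    [2, 8, 3],
    [1, 3, 6],
]
cond, background = conditional_and_background(toy_counts)
scores = [deviation(row, background) for row in cond]
reduced = merge_letters(toy_counts, 0, 1)
```

Pooling counts keeps the total number of pairs unchanged, so the same two functions can be reapplied to the reduced table to obtain conditional distributions for the coarse-grained alphabet.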

  • Received 30 December 2001

DOI:https://doi.org/10.1103/PhysRevE.66.021906

©2002 American Physical Society

Authors & Affiliations

Xin Liu (1), Di Liu (2), Ji Qi (1), and Wei-Mou Zheng (1)

  • (1) Institute of Theoretical Physics, China, Beijing 100080, China
  • (2) Center of Bioinformatics at Peking University, Beijing 100871, China

Issue

Vol. 66, Iss. 2 — August 2002
