Abstract
We study the statistical properties of contact vectors, a construct used to characterize a protein’s structure. The contact vector of an N-residue protein is a list of N integers, the i-th of which is the number of residues in contact with residue i. We study analytically (at the mean-field level) and numerically the amount of structural information contained in a contact vector. Analytical calculations reveal that a large variance in the contact numbers reduces the degeneracy of the mapping between contact vectors and structures. Exact enumeration for lengths up to on the three-dimensional cubic lattice indicates that the growth rate of the number of contact vectors as a function of N is only 3% less than that for contact maps. In particular, for compact structures we present numerical evidence that, practically, each contact vector corresponds to only a handful of structures. We discuss how this information can be used for better structure prediction.
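As an illustration of the definition above, the following minimal sketch computes the contact vector of a chain on the three-dimensional cubic lattice. It assumes, as is common for lattice models but not stated in the abstract, that two residues are in contact when they occupy nearest-neighbor lattice sites and are not adjacent along the chain. The function name `contact_vector` and the coordinate representation are hypothetical choices made for this example only.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int]

# The six nearest-neighbor displacements on the cubic lattice.
STEPS: List[Coord] = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def contact_vector(coords: List[Coord]) -> List[int]:
    """Return the list whose i-th entry counts the contacts of residue i.

    Assumption (not taken from the source): a contact is a pair of residues
    on nearest-neighbor lattice sites that are not consecutive in the chain.
    """
    # Map each occupied site to the index of the residue sitting on it.
    index_of: Dict[Coord, int] = {site: i for i, site in enumerate(coords)}
    if len(index_of) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    vector: List[int] = []
    for i, (x, y, z) in enumerate(coords):
        count = 0
        for dx, dy, dz in STEPS:
            j = index_of.get((x + dx, y + dy, z + dz))
            # Skip empty sites and chain neighbors (|i - j| == 1).
            if j is not None and abs(i - j) > 1:
                count += 1
        vector.append(count)
    return vector


# A four-residue chain folded into a square: only the two ends touch.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(contact_vector(square))  # [1, 0, 0, 1]
```

Because each contact is counted once for each of its two residues, the entries of a contact vector always sum to twice the number of contacts. Distinct conformations can share the same vector, which is the degeneracy the abstract refers to.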
Received 1 September 2001
DOI:https://doi.org/10.1103/PhysRevE.65.041904
©2002 American Physical Society