Abstract
How DNA is mapped to functional proteins is a basic question of living matter. We introduce and study a physical model of protein evolution which suggests a mechanical basis for this map. Many proteins rely on large-scale motion to function. We therefore treat proteins as learning amorphous matter that evolves towards such a mechanical function: Genes are binary sequences that encode the connectivity of the amino acid network that makes a protein. The gene is evolved until the network forms a shear band across the protein, which allows for the long-range, soft modes required for protein function. The evolution reduces the high-dimensional sequence space to a low-dimensional space of mechanical modes, in accord with the observed dimensional reduction between genotype and phenotype of proteins. Spectral analysis of the space of solutions shows a strong correspondence between the localization of the mechanical modes around the shear band and the localization of the sequence structure there. Specifically, our model shows how mutations are correlated among amino acids whose interactions determine the functional mode.
Received 12 August 2016
DOI: https://doi.org/10.1103/PhysRevX.7.021037
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Popular Summary
Proteins are complex molecules that perform a range of essential tasks in a living cell. Each protein is a chain of a few hundred basic building blocks known as amino acids that are assembled according to a blueprint written in a cell’s DNA. While the structure of individual proteins is well understood, the secret of how a simple DNA gene—a long “word” that lists the amino acids—encodes the intricate structure and biochemistry of a protein remains elusive. Our paper offers a simple answer: A protein’s function relies on how it moves, and this large-scale motion is exactly what is coded for by the DNA.
We develop a simple mathematical model of a protein, where the gene is a binary sequence that determines the connectivity of the amino acid network that makes the protein. In our simulations, the gene evolves such that the protein network moves in a specific fashion, which facilitates the function of the protein. Our model is inexpensive to compute, and this allows us to repeat the evolutionary search millions of times. The resulting large sample of solutions allows us to explain the fundamental properties of the relation between DNA and proteins. In particular, it explains why biochemical function is described by only a few parameters, a number that is much smaller than the length of the gene or the number of amino acids in the protein.
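The kind of evolutionary search described above can be illustrated with a minimal Python sketch. It is not the model from the paper: the fitness function `band_score`, the grid layout, the reading of bit 0 as "flexible" and bit 1 as "rigid", and the accept-if-not-worse rule are all assumptions made for illustration. The actual model evaluates the mechanical response of an amino acid network, which this toy replaces with a simple count of flexible sites in a row.

```python
import random


def band_score(gene, rows, cols):
    """Toy stand-in for a mechanical fitness (hypothetical, not the paper's).

    The gene is read as a rows x cols grid; bit 0 is treated as a
    'flexible' site and bit 1 as a 'rigid' site. The score is the largest
    number of flexible sites found in any single row, so a score equal to
    `cols` means one row is flexible across the whole width.
    """
    return max(
        sum(1 for c in range(cols) if gene[r * cols + c] == 0)
        for r in range(rows)
    )


def evolve(rows=4, cols=6, max_steps=10_000, seed=0):
    """Evolve a random binary gene by single-bit mutations.

    A mutation is kept if the score does not decrease (neutral or
    beneficial) and reverted otherwise. The search stops when a fully
    flexible row exists or after max_steps mutations.
    """
    rng = random.Random(seed)
    gene = [rng.randint(0, 1) for _ in range(rows * cols)]
    score = band_score(gene, rows, cols)
    steps = 0
    while score < cols and steps < max_steps:
        i = rng.randrange(len(gene))
        gene[i] ^= 1  # point mutation: flip one bit
        new_score = band_score(gene, rows, cols)
        if new_score >= score:
            score = new_score  # keep neutral or beneficial mutation
        else:
            gene[i] ^= 1  # revert deleterious mutation
        steps += 1
    return gene, score, steps


if __name__ == "__main__":
    gene, score, steps = evolve(seed=1)
    print(f"score {score} reached after {steps} mutations")
```

Calling `evolve` with different seeds yields different solution genes, which is the toy analogue of repeating the search many times. In this sketch only the bits of the fully flexible row are constrained in a solution, while the remaining bits are free to drift.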
Our model shows how the parameters for a protein’s function are encoded in specific parts of the gene sequence that are surrounded by large chunks that record mostly random evolution. Understanding proteins as evolvable soft machines may open possibilities to tweak and engineer nanomachines with new biological functions.