• Featured in Physics
  • Open Access

Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

Long Qian and Edo Kussell
Phys. Rev. X 6, 041009 – Published 14 October 2016
See Focus story: Evolution Thins Out Distracting DNA

Abstract

The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.
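The hallmark described above, that similar words have correlated frequencies, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`count_words`, `neighbors`, `neighbor_correlation`), the pseudocount, and the choice of "similar" as "differing at exactly one position" are all assumptions made for illustration. This naive version also does not control for base composition, which by itself can induce a correlation between neighboring words.

```python
from collections import Counter
from itertools import product
import math
import random


def count_words(seq, k):
    """Count every overlapping length-k word made only of A, C, G, T."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):  # skip windows with N or other symbols
            counts[word] += 1
    return counts


def neighbors(word):
    """Yield all words that differ from `word` at exactly one position."""
    for i, base in enumerate(word):
        for other in "ACGT":
            if other != base:
                yield word[:i] + other + word[i + 1:]


def neighbor_correlation(counts, k, pseudocount=1.0):
    """Pearson correlation of log counts over all one-mismatch word pairs.

    The pseudocount (an assumption of this sketch) keeps log() finite
    for words that never occur.
    """
    xs, ys = [], []
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        for nb in neighbors(word):
            xs.append(math.log(counts.get(word, 0) + pseudocount))
            ys.append(math.log(counts.get(nb, 0) + pseudocount))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # all counts identical, correlation undefined
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy input: a random sequence stands in for a real genome.
    rng = random.Random(0)
    toy = "".join(rng.choice("ACGT") for _ in range(20000))
    print(neighbor_correlation(count_words(toy, 4), 4))
```

On a real genome one would read the sequence from a FASTA file and compare the result against a null model that preserves base composition before interpreting the correlation.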

  • Received 24 February 2016

DOI:https://doi.org/10.1103/PhysRevX.6.041009

Published by the American Physical Society under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Physics Subject Headings (PhySH)

Physics of Living Systems

Focus

Evolution Thins Out Distracting DNA

Published 14 October 2016

Proteins sometimes bind to the wrong stretch of DNA, but these "imposter" DNA sequences are statistically rare in many genomes, suggesting that evolution works against them.

Authors & Affiliations

Long Qian (1) and Edo Kussell (1, 2)

  • (1) Department of Biology and Center for Genomics and Systems Biology, New York University, New York, New York 10003, USA
  • (2) Department of Physics, New York University, New York, New York 10003, USA

Popular Summary

Nonfunctional DNA binding by proteins can disrupt basic cellular processes such as transcription, gene regulation, replication, and mutational repair. By evolving their global composition of short DNA “words,” genomes could potentially reduce the frequency of nonfunctional binding. Here, we show that such an evolutionary process imposes global constraints in all genomes and operates via tiny steps over time scales spanning millions of generations.

In this study, we analyze DNA-binding proteins in 947 bacterial or archaeal genomes and in the genomes of 75 eukaryotic species. We determine the global constraints that are set by the distinct complement of DNA-binding proteins present in each genome and that are responsible for preserving ancient phylogenetic signals. Using a mathematical model, we show that a distinctive genomic signature of such constraints is preserved in the genomes of widely divergent species, for example, D. melanogaster (the common fruit fly) and M. musculus (the house mouse), which diverged over 600 million years ago. Our analysis demonstrates that weak binding sites in genomes are preferentially avoided, a result that holds true across the domains of life. Put another way, we show that the global word composition of each genome has been molded by its DNA-binding proteins over the course of evolution.

The outcomes of our study, which reveal that a large number of small effects act collectively to maintain genomic binding landscapes over long evolutionary time scales, pave the way for investigations of how this general evolutionary mechanism impacts a wide range of cellular processes.
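One simple way to make "preferentially avoided" concrete is to compare how often a word occurs with how often it would be expected from base composition alone. The sketch below is only an illustration under stated assumptions: the study defines weak binding sites from in vitro affinity measurements, whereas here the word is simply given, the null model treats bases as independent, and the function names are hypothetical.

```python
from collections import Counter
import math


def base_frequencies(seq):
    """Return the fraction of each of A, C, G, T among the valid bases."""
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {b: counts[b] / total for b in "ACGT"}


def observed_over_expected(seq, word):
    """Ratio of observed word count to the count expected if bases were
    drawn independently with the sequence's own base frequencies.

    Values below 1 indicate depletion relative to this simple null
    model (an assumption of the sketch, not the published method).
    Assumes `seq` consists of A, C, G, T so every window is countable.
    """
    k = len(word)
    freqs = base_frequencies(seq)
    positions = len(seq) - k + 1
    observed = sum(1 for i in range(positions) if seq[i:i + k] == word)
    expected = positions * math.prod(freqs[b] for b in word)
    return observed / expected if expected > 0 else float("nan")
```

For example, `observed_over_expected(genome, "TTGACA")` would score a single hexamer, where `genome` is a hypothetical string holding the sequence; scoring many words and grouping them by measured binding affinity would be the next step in an analysis of this kind.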

Issue

Vol. 6, Iss. 4 — October - December 2016

Reuse & Permissions

It is not necessary to obtain permission to reuse this article or its components as it is available under the terms of the Creative Commons Attribution 3.0 License. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and the published article's title, journal citation, and DOI are maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission from the rights holder directly for these figures.
