Abstract
Several processes in the cell, such as gene regulation, start when key proteins recognize and bind to short DNA sequences. However, as these sequences can be hundreds of millions of times shorter than the genome, they are hard to find by simple diffusion: diffusion-limited association rates may underestimate in vitro measurements by up to several orders of magnitude. Moreover, the rates increase if the DNA is coiled rather than straight. Here we model how this works in vivo in mammalian cells. We use chromatin-chromatin contact data from Hi-C experiments to map the protein target search onto a network problem. The nodes represent DNA segments, and the weights of the links are proportional to measured contact probabilities. We then put forward a diffusion-reaction equation for the density of searching proteins that allows us to calculate the association rates across the genome analytically. We find that segments with high association rates are enriched with active gene starts and have high RNA expression levels. This paper suggests that the DNA's 3D conformation is important for protein search times in vivo and offers a method to interpret protein-binding profiles in eukaryotes that cannot be explained by the DNA sequence itself.
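To make the network picture concrete, the following is a minimal sketch of a discrete random-walk analogue on a contact network. It is not the diffusion-reaction equation put forward in the paper: it assumes a walker that hops between segments with probabilities proportional to the contact weights, and it uses the inverse of the average mean first-passage time as a rough stand-in for an association rate. The function names and the toy contact matrix are hypothetical and are not taken from the paper or from any Hi-C data set.

```python
import numpy as np

def mean_first_passage_times(contacts, target):
    """Mean number of hops for a random walker to first reach `target`,
    starting from each node (assumed discrete-time walk, illustrative only)."""
    w = np.asarray(contacts, dtype=float)
    # Row-normalize contact weights into hopping probabilities.
    p = w / w.sum(axis=1, keepdims=True)
    others = [i for i in range(len(w)) if i != target]
    # Treat the target as absorbing and solve (I - Q) t = 1 on the other nodes.
    q = p[np.ix_(others, others)]
    t = np.linalg.solve(np.eye(len(others)) - q, np.ones(len(others)))
    mfpt = np.zeros(len(w))
    mfpt[others] = t
    return mfpt

def search_rate_proxy(contacts):
    """Inverse of the average first-passage time to each node
    (a hypothetical proxy, not the paper's association rate)."""
    n = len(contacts)
    return np.array([
        (n - 1) / mean_first_passage_times(contacts, k).sum()
        for k in range(n)
    ])

# Toy contact matrix for three segments in a chain (made-up numbers).
contacts = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
mfpt_to_last = mean_first_passage_times(contacts, target=2)
rates = search_rate_proxy(contacts)
```

In this toy chain the middle segment, which is in contact with both neighbors, gets the highest proxy rate, which illustrates qualitatively how well-connected segments could be found faster.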
- Received 19 September 2019
- Accepted 16 November 2020
DOI:https://doi.org/10.1103/PhysRevResearch.3.013055
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. Funded by Bibsam.