• Open Access

Deep Learning Protein Conformational Space with Convolutions and Latent Interpolations

Venkata K. Ramaswamy, Samuel C. Musson, Chris G. Willcocks, and Matteo T. Degiacomi
Phys. Rev. X 11, 011052 – Published 15 March 2021

Abstract

Determining the different conformational states of a protein and the transition paths between them is key to fully understanding the relationship between biomolecular structure and function. This can be accomplished by sampling protein conformational space with molecular simulation methodologies. Despite advances in computing hardware and sampling techniques, simulations always yield a discretized representation of this space, with transition states undersampled proportionally to their associated energy barrier. We present a convolutional neural network that learns a continuous conformational space representation from example structures, and loss functions that ensure intermediates between examples are physically plausible. We show that this network, trained with simulations of distinct protein states, can correctly predict a biologically relevant transition path, without any example on the path provided. We also show that we can transfer features learned from one protein to others, which results in superior performance and requires a surprisingly small number of training examples.
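As a rough illustration of the latent interpolation idea named in the title, the sketch below generates evenly spaced points on a straight line between two latent vectors. It is not the authors' implementation: the function name `interpolate_latent`, the two-dimensional latent vectors, and the choice of linear interpolation are assumptions made for this example. In a real pipeline, each interpolated vector would be passed to a trained decoder to obtain atomic coordinates.

```python
from typing import List, Sequence


def interpolate_latent(
    z_start: Sequence[float], z_end: Sequence[float], n_points: int
) -> List[List[float]]:
    """Return n_points latent vectors evenly spaced from z_start to z_end.

    Hypothetical helper: assumes both end points were produced by some
    encoder and that a straight line in latent space is a useful path.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")

    path = []
    for i in range(n_points):
        t = i / (n_points - 1)  # interpolation parameter in [0, 1]
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Illustrative latent codes for two conformational states (made-up values).
z_open = [0.0, 0.0]
z_closed = [1.0, 2.0]
path = interpolate_latent(z_open, z_closed, 5)
for z in path:
    print(z)
```

Each point on `path` stands for a candidate intermediate; whether the decoded structure is physically plausible depends on how the network was trained, which is the role the abstract assigns to the loss functions.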

  • Received 8 June 2020
  • Revised 15 December 2020
  • Accepted 26 January 2021

DOI: https://doi.org/10.1103/PhysRevX.11.011052

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.


Physics Subject Headings (PhySH)

Physics of Living Systems; Polymers & Soft Matter; Interdisciplinary Physics

Authors & Affiliations

Venkata K. Ramaswamy1,*, Samuel C. Musson1,*, Chris G. Willcocks2,‡, and Matteo T. Degiacomi1,†

  • 1Department of Physics, Durham University, Stockton Road, Durham DH1 3LE, United Kingdom
  • 2Department of Computer Science, Durham University, Stockton Road, Durham DH1 3LE, United Kingdom

  • *These authors contributed equally to this work.
  • †Corresponding author. matteo.t.degiacomi@durham.ac.uk
  • ‡Corresponding author. christopher.g.willcocks@durham.ac.uk

Popular Summary

Proteins carry out a range of essential biological functions including catalysis, sensing, motility, transport, and defense. These molecules function independently or in unison with other molecules, such as DNA, drugs, or other proteins. To perform these functions, a protein must often change its shape (or conformation), but identifying the possible conformations of a protein is not an easy task. We present a methodology that combines molecular simulation and machine learning to discover transition paths between protein conformational states.

Current experimental techniques provide a good picture of the most stable conformations but reveal little to nothing about the transition paths or intermediate states between them. Despite the importance of these intermediate conformations for drug targeting, determining them reliably remains a challenging problem.

We overcome this hurdle with a neural network that can generate new protein conformations after being trained with examples from experiments or molecular simulations. The network is capable of generating structures that respect physical laws and features the first universal architecture, capable of handling any protein without having to be modified.

Our network can predict a physically realistic and biologically relevant transition path between different conformations of the protein MurD, a potential antibacterial drug target. The network is also capable of transfer learning, whereby its training with a sparse dataset is facilitated by pretraining on a different, larger dataset. This opens the door to a new class of computational methods capable of simultaneously characterizing every existing protein.

Issue

Vol. 11, Iss. 1 — January - March 2021

