Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models

Magnus Ekeberg, Cecilia Lövkvist, Yueheng Lan, Martin Weigt, and Erik Aurell
Phys. Rev. E 87, 012707 – Published 11 January 2013

Abstract

Spatially proximate amino acids in a protein tend to coevolve. A protein's three-dimensional (3D) structure hence leaves an echo of correlations in the evolutionary record. Reverse engineering 3D structures from such correlations is an open problem in structural biology, pursued with increasing vigor as more and more protein sequences continue to fill the data banks. Within this task lies a statistical inference problem, rooted in the following: correlation between two sites in a protein sequence can arise from firsthand interaction but can also be network-propagated via intermediate sites; observed correlation is not enough to guarantee proximity. To separate direct from indirect interactions is an instance of the general problem of inverse statistical mechanics, where the task is to learn model parameters (fields, couplings) from observables (magnetizations, correlations, samples) in large systems. In the context of protein sequences, the approach has been referred to as direct-coupling analysis. Here we show that the pseudolikelihood method, applied to 21-state Potts models describing the statistical properties of families of evolutionarily related proteins, significantly outperforms existing approaches to the direct-coupling analysis, the latter being based on standard mean-field techniques. This improved performance also relies on a modified score for the coupling strength. The results are verified using known crystal structures of specific sequence instances of various protein families. Code implementing the new method can be found at http://plmdca.csc.kth.se/.
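The pseudolikelihood idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It rests on assumptions the abstract does not state: the function names `neg_log_pseudolikelihood` and `frobenius_scores`, the array layout of the fields `h` and couplings `J`, the absence of regularization and sequence reweighting, and the use of a centered Frobenius norm as the coupling score are all illustrative choices, not necessarily those of the published code. The objective replaces the intractable full likelihood of the Potts model by a sum of per-site conditional log-probabilities, each of which needs only a normalization over the q = 21 states of a single site.

```python
import numpy as np

def neg_log_pseudolikelihood(seqs, h, J):
    """Average negative log-pseudolikelihood of a q-state Potts model.

    seqs : (B, N) integer array, entries in [0, q)   (hypothetical encoding)
    h    : (N, q) fields
    J    : (N, N, q, q) couplings, J[i, j, a, b] couples state a at site i
           to state b at site j; the diagonal blocks J[i, i] are ignored.
    """
    B, N = seqs.shape
    total = 0.0
    for r in range(N):
        # logits[b, a] = h[r, a] + sum over j != r of J[r, j, a, seqs[b, j]]
        logits = np.tile(h[r], (B, 1)).astype(float)
        for j in range(N):
            if j != r:
                logits += J[r, j][:, seqs[:, j]].T
        # Numerically stable log of the per-site normalization constant
        m = logits.max(axis=1, keepdims=True)
        log_norm = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
        # Log-probability of the observed state at site r given all other sites
        total -= (logits[np.arange(B), seqs[:, r]] - log_norm).sum()
    return total / B

def frobenius_scores(J):
    """Assumed score: Frobenius norm of each centered q x q coupling block."""
    Jc = (J
          - J.mean(axis=2, keepdims=True)
          - J.mean(axis=3, keepdims=True)
          + J.mean(axis=(2, 3), keepdims=True))
    return np.sqrt((Jc ** 2).sum(axis=(2, 3)))

# Toy usage on random data (not a real alignment)
rng = np.random.default_rng(0)
q, N, B = 21, 4, 5
seqs = rng.integers(0, q, size=(B, N))
h = np.zeros((N, q))
J = np.zeros((N, N, q, q))
# With all-zero parameters every conditional is uniform,
# so the objective equals N * log(q).
nll = neg_log_pseudolikelihood(seqs, h, J)
scores = frobenius_scores(J)
```

In practice the objective would be minimized over `h` and `J` with a gradient-based optimizer and some regularization, and site pairs would then be ranked by their score; those steps are omitted here.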

Received 23 October 2012

DOI: https://doi.org/10.1103/PhysRevE.87.012707

©2013 American Physical Society

Authors & Affiliations

Magnus Ekeberg¹, Cecilia Lövkvist², Yueheng Lan³, Martin Weigt⁴,*, and Erik Aurell²,⁵,⁶,†

  • ¹ Engineering Physics Program, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
  • ² Department of Computational Biology, Alba Nova University Center, 106 91 Stockholm, Sweden
  • ³ Department of Physics, Tsinghua University, Beijing 100084, P.R. China
  • ⁴ Université Pierre et Marie Curie, UMR7238, Laboratoire de Génomique des Microorganismes, 15 rue de l'Ecole de Médecine, 75006 Paris, France
  • ⁵ ACCESS Linnaeus Center, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
  • ⁶ Aalto University, Department of Information and Computer Science, PO Box 15400, FI-00076 Aalto, Finland

  • * martin.weigt@upmc.fr
  • † eaurell@kth.se

Issue

Vol. 87, Iss. 1 — January 2013
