One-dimensional nature of protein low-energy vibrations

Protein internal dynamics is crucial for its function. In particular, low-energy vibrational modes at 1–10 meV play important roles in the transportation of energy inside the protein molecule, and facilitate its enzymatic function and binding to ligands and other biomolecules. However, the microscopic spatiotemporal details of these modes have remained largely unknown, due to limitations of the experimental techniques. Here, by applying inelastic neutron scattering on a perdeuterated protein, we demonstrate that these vibration modes are correlated primarily through peptide bonds rather than noncovalent interactions (including hydrogen bonds), which is further conﬁrmed by a complementary molecular dynamics simulation. More importantly, the complex spatiotemporal features of interatomic vibrations observed in an all-atom simulation are qualitatively reproduced by an ultrasimple toy model, a one-dimensional harmonic chain, where the vibrations propagate along the peptide chain, but are conﬁned and damped by noncovalent interactions surrounding the chain. Our ﬁndings are fundamentally important for understanding many functional processes in proteins that are strongly coupled to low-energy vibrational modes. Moreover, the one-dimensional nature of low-energy vibrations discovered here should be applicable to other biomacromolecules and many ordinary polymeric materials


I. INTRODUCTION
Protein vibrations span a wide energy window, from high-energy (∼1000 meV, femtosecond) local bond stretching to low-energy (∼1 meV, picosecond) collective modes [1,2].Low-energy collective vibrations are crucial for the transportation of thermal energy and perturbation in protein molecules [3,4].An example is transportation of perturbation starting from hydration water on the protein surface to amino acids deep inside the biomolecule, which impacts its enzymatic function [4].These modes also play an important role in the lowering of energy barriers and speeding up the catalytic reactions [5] of protein molecules, and facilitating large-scale conformational movements [6], as well as binding to ligands and other biomolecule [7,8], etc.
Coherent neutron scattering directly probes interatomic dynamic correlations [18][19][20], and therefore constitutes a valuable experimental approach to the low-energy vibration of proteins.Nonetheless, neutron experiments, performed on ordinary proteins, which are full of hydrogen atoms, measure predominantly the self-motions of hydrogen atoms [7,9] due to their ultralarge incoherent scattering cross sections, and thus obscure interatomic correlations.Here, to characterize interatomic vibrations, we applied inelastic neutron scattering on fully deuterated proteins, which has been rarely conducted [20].Our experiment reveals that the vibrations in an energy window of 1-10 meV are primarily correlated through peptide bonds rather than a secondary structural motif linked by hydrogen bonds.A complementary molecular dynamics (MD) simulation confirmed these experimental findings.More importantly, we found that a toy model of a one-dimensional harmonic chain confined and damped by neighboring residues can fully reproduce the complex spatiotemporal features of interatomic vibrations observed in the all-atom simulation.Our study therefore demonstrates the one-dimensional nature of low-energy vibrations in proteins.

II. NEUTRON EXPERIMENT ON PROTEIN VIBRATIONS
The neutron scattering experiments were conducted on cytochrome P450 (CYP101).P450's are an important enzyme family catalyzing a variety of biochemical reactions involved in carcinogenesis, drug metabolism, lipid and steroid biosynthesis, and degradation of pollutants in higher organisms [21].The atomic structure of the protein is illustrated in Fig. 1(a).Both hydrogenated CYP101 and its perdeuterated counterpart were measured using time-of-flight neutron spectroscopy (LET) at ISIS, UK.These two samples are referred to as H-CYP and D-CYP, respectively.Detailed experimental procedures are supplied in the Supplemental Material (SM) [22].The experimentally measured quantity is the dynamic structure factor (DSF), S(q, E ), which characterizes the density of dynamic modes as a function of energy transfer E and wave vector q.As shown in Eqs.(S4) and (S5) in SM [22], DSF is a sum of an incoherent component S inc (q, E ) and a coherent component S coh (q, E ).The neutron signals from the H-protein are primarily incoherent (∼90%), characterizing self-correlations of hydrogen atoms, whereas the ones from the D-protein are mostly coherent (∼90%), measuring mostly interatomic correlations of the protein's heavy atoms [23,24].
As displayed in Figs.1(b) and 1(c) and the corresponding insets, DSF measured on both H-CYP and D-CYP exhibits a prominent inelastic peak around 3 meV, which corresponds to a timescale ∼1.5 ps.The peak is located at 2.7 meV in D-CYP, slightly lower than that in H-CYP (3.1 meV), which might result from the fact that the hydrogen atoms in H-CYP vibrate faster due to their lighter weight as compared to D atoms in D-CYP.The collective nature of a dynamic mode can be determined by analyzing the q dependence of the intensity of the coherent DSF [18][19][20]23,25].An in-phase, collective mode at a given energy E 0 gives S coh (q, E 0 ) ∼ I (q)q 2 , whereas an uncorrelated mode gives S coh (q, E 0 ) ∼ q 2 [18][19][20]23,25].Here, I (q) is the static structure factor, providing the spatial correlation between atoms.In the present study, we integrated the dynamical structure factor measured on D-CYP at 5 K over the energy window from −10 to 10 meV, and used its q dependence to approximate the experimental static structure factor, denoted as I a (q).This approximation should work reasonably well as such a low temperature will dramatically suppress the high-energy modes beyond 10 meV, which is further supported by the simulation results [see Fig. S2(c)].As shown in Fig. 1(d), the so-obtained I a (q) contains two structural peaks, located at 1.4 Å −1 (peak I) and 2.7 Å −1 (peak II), respectively.Peak I corresponds to a length scale of 4-6 Å, which is the typical distance of secondary structural motifs in the protein.For example, the pitch of α-helix is about 5.4 Å and the inter-β-band distance is ∼5 Å [10,26].These secondary structural motifs are formed due to specific inter-residue hydrogen bonds. Pak II corresponds to 2-4 Å, which is the typical distance between two neighboring residues linked by a peptide bond.In Fig. S1, we further show that peak II of the calculated S(q) diminishes drastically when removing every second residue along the peptide chain, while peak I remain almost intact.This verifies that peak II results from correlations between nearest neighbors along the peptide chain.
As shown in Figs.1(e) and 1(f), at both 2.7 and 7.0 meV, the DSF measured on D-CYP, which is dominated by coherent scattering, exhibits a pronounced hump at ∼2.7 Å −1 (peak II), but not at 1.4 Å −1 (peak I).This indicates that the vibration modes in this energy range are primarily correlated through peptide bonds rather than the secondary structural motifs linked by hydrogen bonds.It is consistent with the result of Ref. [27], which demonstrated that the propagation of external thermal perturbation in a peptide helix is primarily through the peptide bond instead of the inter-residue hydrogen bonds.Our finding also agrees with that of a recent neutron scattering experiment on perdeuterated green fluorescent protein [20], which also reported the low cooperativity of low-energy vibrations at peak I. Note, however, Ref. [20] did not probe q above 2 Å −1 , and hence did not see the strongly correlated vibrations at peak II [Figs.1(e) and 1(f)].
As a control, we also analyzed the q dependence of DSF measured on H-CYP at both 2.7 and 7 meV, which is dominated by incoherent scattering.As shown in Figs.1(e) and 1(f), it increases roughly monotonically with q, with no visible peak in the range.This is expected for self-atomic correlations [18,20,25].FIG. 2. MD-derived neutron spectra.(a) S coh (q, E ) and (b) S inc (q, E ) at 120 K, where the results were summed up in the q range from 1 to 4 Å −1 , being consistent with the experiment.(c) The static structure factor of D-CYP calculated from a single protein molecule using the crystal structure.q dependence of the (d) S coh (q, E 0 ) and (e) S inc (q, E 0 ) at 2.7 and 7.0 meV.The calculated S coh (q, E 0 = 7 meV) is multiplied by a factor of 12 for a better comparison of the shape.

III. MD-DERIVED NEUTRON SPECTRA
The neutron spectra calculated from the complementary molecular dynamics (MD) simulations were presented in Fig. 2. Detailed MD protocols are supplied in the Supplemental Material [22].As evident by Figs.2(a) and 2(b), both the incoherent and coherent DSFs show prominent inelastic peaks around 2-4 meV.S(q) [Fig.2(c)] also contains peaks I and II.The q dependence of the coherent DSF S coh (q, E 0 ) at fixed E 0 (2.7 and 7.0 meV) presents a pronounced peak at 3 Å −1 (peak II) alongside a tiny hump at around 1.4 Å −1 (peak I) [Figs.2(d) and 2(e)], implying that these vibrational modes are correlated mostly through peptide bonds.In contrast, the incoherent DSF varies monotonically with q [see Figs.2(d) and 2(e)].Hence, the MD-derived neutron spectra (Fig. 2), both dynamical and static structure factors, are in qualitative agreement with the experimental ones (Fig. 1).

IV. TWO-PARTICLE DISPLACEMENT CORRELATION FUNCTION DERIVED FROM MD
To quantify the spatiotemporal features of protein vibration modes, we use MD trajectories to compute the two-particle displacement correlation function (TPDC), where a and b label residues, − → r a (t + t 0 ) and − → r a (t 0 ) are the residue positions at time t + t 0 and t 0 , respectively, while denotes the time average over the choice of t 0 .TPDC takes a large value if two residues are highly correlated, and becomes zero if they move independently [28,29].In Fig. 3(a), we compared TPDC at t * = 1.5 ps calculated for three types of residue pairs, which are respectively linked by To explore more microscopic details of correlated vibrations, we analyze TPDC as a function of distance between two residues.As shown in Fig. 3(b), C(t  * ) decreases monotonically and reaches 0 around r = 20 Å, which is close to the radius of the gyration of protein (21.4 Å).We also analyzed the dependence of TPDC on the "chemical distance," which is defined as the difference between the sequence numbers of two residues, e.g., r id = 4 when the two are linked along the peptide chain by three intermediate residues.As seen in Fig. 3(c), C(t  * ) decreases monotonically and reaches 0 around r id = 18, implying that correlation proceeds along the backbone for nearly 20 residues.
To further distinguish the roles played by peptide bonds and noncovalent interactions, we computed C(t * ) as a function of coordination shell numbers, separately for neighbors connected by peptide bonds and for neighbors connected by noncovalent interactions [see the illustration in Fig. 3(d)].The obtained results, displayed in Fig. 3(e), again demonstrate that correlations decay much slower for neighbors connected by peptide bonds.Hence it is primarily the peptide chain that propagates the correlated vibrations.To explore the temporal features of interatomic vibrations along the peptide chain, we analyzed C(t) as a function of t at fixed values of r id .As shown in Fig. 4(b), C(t) oscillates with t similar to an underdamped harmonic oscillator, and converge to a nonzero plateau C(∞) in the long-time limit.

V. TOY MODEL: CONFINED UNDERDAMPED HARMONIC CHAIN
Both our experimental and MD results indicate that protein vibrations on 1-10 meV propagate predominantly along the protein backbone.This discovery prompts us to model protein as a one-dimensional array of beads (residues) connected by springs (peptide bonds), and at the time confined and damped by neighboring residues through the noncovalent interactions [see Fig. 4(a)].The dynamics of this model can be described by a linear Langevin equation, where u i (t ) is the displacement of the ith bead from equilibrium, m the bead mass, and K the strength of peptide bonds.λ and K 0 are the friction coefficient and restoring force coefficient, both due to the confinement of neighboring residues, and finally ζ i (t ) is the white noise, whose variance is related to λ via the fluctuation dissipation theorem ζ i (t )ζ j (t ) = 2k B T λδ i j δ(t i − t j ).Note that the latter three terms, i.e., friction, restoring, and thermal noise, are all due to noncovalent interactions.
TPDC of this model can be calculated analytically.With details relegated to SM [22], we find where ω c = λ 2m and (y) = y 2 K m + K 0 m − λ 2 4m 2 .By fitting Eq. ( 3) to MD-derived TPDC at different r id [see Fig. 4(b)], we obtained the best-fitting parameters m ∼ 103 g/mol, K 0 ∼ 0.78 N/m, K ∼ 13.7 N/m, and λ ∼ 2.7 × 10 −13 kg/s.Note that K 0 is smaller than K by ∼20 times, quantitatively confirming the result of the elastic network model [17], which assumes distinct force constants for the peptide-bonded and nonbonded residue pairs in the protein.The huge difference between K 0 and K also naturally explains why vibrations propagate primarily along the backbone [cf.Fig. 3(e)].Moreover, the best-fitting values of λ and m agree quantitatively with the protein internal friction coefficient (λ ∼ 2 × 10 −13 kg/s) (λ ∼ 2 × 10 −13 kg/s reported in Ref. [30], and the average mass of one protein residue (114 g/mol), respectively.
As shown in Fig. 4(b), Eq. ( 3) provides a qualitatively good fitting to MD-computed TPDC at all values of r id with a single set of parameters.This is remarkable, given the simplicity of this model.With a careful examination of Fig. 4(c), one can see some quantitative difference between the fit and the MD-derived results, particularly at the second oscillation peak around 5 ps.This could result from the fact that the model applied here is extremely abstract without taking into account the complex chemical and structural heterogeneity inside the protein.Such heterogeneity can furnish a distribution of K and K 0 , which could lead to the quantitative difference observed.This might be improved if one replaces the harmonic-chain potential by one or a few low-frequency harmonic modes derived from the normal mode analysis [13,31].This is, however, beyond the scope of the present work.
Moreover, Eq. (3) predicts the long-time limit of TPDC as Substituting T = 120 K as well as best-fitting values for K 0 and K, we can calculate C(∞) as a function of r id .As shown in Fig. 4(c), the obtained results quantitatively agree with those directly derived from MD for r id between 1 and 15, further validating the toy model [Eq.( 3 , the characteristic distance for the vibrations to propagate along the peptide chain scales as K K 0 .Hence, the bonded interactions are responsible for propagation of the vibrations, while the nonbonded ones dissipate the modes.

VI. CONCLUSION
In this Rapid Communication, we have performed neutron scattering on fully deuterated cytochrome P450, and studied the interatomic vibrations on 1-10 meV.We found that these low-energy vibrations are correlated primarily through peptide bonds rather than nonbonded interactions.More importantly, we have shown that the spatiotemporal features of collective vibrations in the protein seen in the all-atom MD simulation can be successfully reproduced by a simple toy model of a confined and underdamped harmonic chain.Our results demonstrate that the low-energy vibrations propagate one dimensionally along the peptide chain, but are damped by the surrounding noncovalent environment.The one-dimensional behaviors of the low-energy vibrations discovered here in the protein are likely to be generally applicable to other biomacromolecules (DNA, RNA, lipids, etc.) or even ordinary linear polymers.Indeed, as shown by Ref. [32], the vibrations in DNA fibers can be interpreted as sound waves through a one-dimensional, monoparticle (basepair) chain along the axis of the DNA fiber.
Collective vibrations are believed to play important roles in the transportation of energy, perturbation, and allosteric large-scale conformational changes in biomacromolecules.Our findings provide crucial insight to a proper understanding of these processes.For example, when a protein molecule binds to a ligand or cofactor or is perturbed by other external sources, the resulting energy will be coupled to the low-energy vibrational modes to proceed through the peptide chain.The large force constant of the peptide bond can ensure a fast transportation of the perturbation, while the dissipation by nonbonded neighbors will slowly distribute the energy to the entire protein molecule.

FIG. 1 .
FIG. 1. Experimental dynamic structure factor (DSF) of CYP101.(a) A representative structure of CYP101.DSF of (b) D-CYP and (c) H-CYP measured at 120 K, where the results were summed up in the q range from 1 to 4 Å −1 to improve the statistics.As the present work primarily concerns about how vibrations propagate and dissipate inside the protein molecule, we focus the analysis of neutron spectra above 1 Å −1 to exclude the contribution of the interprotein scattering at low q to the neutron signals [see the discussion of inter-and intraprotein scattering and Figs.S2(a) and S2(b) in SM [22]].(d) The approximate static structure factor (SSF) of D-CYP by integrating DSF measured at 5 K over −10 to 10 meV, where peaks I and II are labeled.Comparison of the q dependence of DSF measured on D-CYP and H-CYP at (e) E 0 = 2.7 meV and (f) E 0 = 7.0 meV.The values of DSF measured on D-CYP were multiplied by a factor of 5 for a better comparison of the shape.The scattering intensities presented in (b)-(f) are relative values by normalizing to the values of a vanadium reference.

FIG. 3 .
FIG. 3. MD-derived C(t * ) with t = 1.5 ps.(a) Histogram of C(t * ) between pairs of residues connected, respectively, by a peptide bond, by a hydrogen bond (HB), and by ordinary van der Waals (vDW) interactions within a distance of 5.5 Å.(b) C(t * ) as function of spatial distance r.(c) C(t * ) as a function of "chemical distance" r id .To improve the statistics, an ensemble average is performed over all residue pairs with the same r and r id in (b) and (c), respectively.(d)Schematic diagram for coordination shells (dashed circles).Detailed algorithm for defining coordination shells is provided in SM[22].(e) C(t * ) as a function of shell numbers, for peptide-bonded neighbors (red) and nonbonded neighbors (green), respectively.

FIG. 4 .
FIG. 4. (a) Illustration of the toy model: One-dimensional harmonic chain confined and damped by neighboring residues.(b) C(t) as a function of t at fixed values of r id .Open symbols correspond to the values calculated from MD, while the dashed lines denote fits using Eq.(3) with a single set of parameters of K, K 0 , λ, and m.(c) The long-time plateau C(∞) derived from MD and from Eq. (3).The MD-derived C(∞) is approximated by averaging C(t) from 15 to 20 ps in (b).