New Dynamical Window onto the Landscape for Forced Protein Unfolding

The unfolding of a protein by the application of an external force pulling two atoms of the protein can be detected by atomic force and optical tweezers technologies as have been broadly demonstrated in the past decade. Variation of the applied force results in a modulation of the free-energy barrier to unfolding and thus, the rate of the process, which is often assumed to have single exponential kinetics. It has been recently shown that it is experimentally feasible, through the use of force clamps, to estimate the distribution of unfolding times for a population of proteins initially in the native state. In this Letter we show how the analysis of such distributions under a range of forces can provide unique information about the underlying free-energy surface such as the height of the free-energy barrier, the preexponential factor and the force dependence of the unfolding kinetics without resorting to ad hoc kinetic models.

Single molecule protein unfolding studies are shedding light on the free-energy landscape underlying the macroscopic properties of proteins.In particular, atomic force microscopy is providing a picture of the free-energy landscape at an unprecedented level of detail, revealing the presence of distinct states and parallel unfolding routes [1,2], the barriers and free-energies between states [3] and in some instances, the secondary structure elements that make up the structured core of unfolding intermediates [4,5].Traditionally, the analysis of force spectroscopy experiments have relied on a phenomenological model due to Bell [6,7], where the mean unfolding time and the maximum unfolding force is predicted to scale exponentially and logarithmically with the force and pulling speed, respectively.It has been recently shown [8][9][10] that even though Bell's formalism works well for average quantities (e.g., maximum force or mean unfolding time), it fails to reproduce the variance and the distribution of those quantities.Furthermore, Bell's model assumes that (i) the integration over all other degrees of freedom other than the reaction coordinate, x, induces a one-dimensional freeenergy profile, GðxÞ, that accurately describes the transition, (ii) the ''distance'' between the native state and the transition state along x is invariant with the force and (iii) the kinetics is single exponential.To this end, Dudko, Hummer, and Szabo [9] proposed an alternative formalism (DHS) that corrects for assumption (ii) and found that their theory is able to account for the variances in the maximum unfolding force at different pulling velocities.Intriguingly, in contrast to Bell, DHS also appears to allow the estimation of the naked barrier height without recourse to arbitrary values of the preexponential factor.However, like Bell, DHS also assume that conditions (i) and (iii) hold.
Below, we show that the distribution of unfolding times not only imposes more severe constraints on a ''model'' (e.g., Bell or DHS), but also contains unprecedented information regarding the free-energy landscape of the protein, amplifying the importance of such single molecule experiments.Remarkably, we observe that at high forces (i.e., low effective barrier), the distribution of unfolding times is not single exponential.An analogous deviation from the expected single exponential kinetics has been recently observed experimentally [11] although the explanation of the deviation from single exponential kinetics was quite different from the one we present below.In addition, by solving the diffusion equation for the probability density, one can fit the distribution of unfolding times and not only extract a model-free estimate of the preexponential factor, the activated time for the unfolding and the force dependence of the unfolding kinetics, but also an independent estimate of x u , which in the mechanical unfolding literature, is commonly taken to be the distance to the transition state.We note that this interpretation of x u is only valid in the one-dimensional case and that x u can be understood in more general terms in the context of a Taylor's expansion of the free-energy around the barrier (see below).
To generate the distributions, we used Langevin dynamics simulations with a coarse-grained (only C atoms are represented) native-centric protein model [12].While it is arguable whether such models are realistic enough to describe how a protein collapses and folds [13][14][15], they are accurate in predicting the mechanical properties of proteins [16,17].
Specifically, we study the mechanically resistant protein, ubiquitin, which has a well-defined transition state both in simulations [16,18] and experiments [19].We first characterized the protein by performing simulations over a broad range of temperatures.At 300 K, the native state is very stable and no unfolding event was recorded during a 1:25 s simulation.Along this simulation, 5000 phase points (coordinates and velocities) were extracted as initial conditions for constant force pulling simulations.Applying a force of 250 pN to the two ends of the polypeptide chain, ubiquitin unfolds in about 0.7 ns on average over 5000 independent simulations.(Unfolding is defined as the protein reaching an extension of 100 A ˚, which is about 40% the maximal length.Unfolding occurs in a highly cooperative manner at an extension of $42 # A even at the largest forces probed in this work and as such, the time distributions do not depend on the exact definition of an unfolding event).At large enough forces, the cumulative probability of unfolding times [Fig.1(a)] is clearly not single exponential, but can be well described by a double exponential.
A simple explanation of the nonexponentiality at high forces arises from assumption (iii) used in deriving onedimensional landscape models of protein unfolding (see above), which is equivalent to taking only the eigenfunction belonging to the lowest eigenvalue of the diffusion equation for the probability density Pðx; tÞ (where ¼ 1=k B T): This is not exact because there is no guarantee that the form of the first eigenfunction is identical to the equilibrium population of the denatured state, which itself forms the initial condition for the evolution of Eq. ( 1).The single exponential approximation becomes exact only in the limit of large barriers (i.e., ) 1k B T). Away from this limit, the next most important term in the series become nonnegligible.The structure of the first two terms of this series is [20,21] PðtÞ where 1 $ 0 e ÁG u ( 0 is the usual Kramers prefactor) and 2 and time scales of all higher terms are of the order of 0 (i.e., do not suffer the Boltzmann penalty).Higher-order terms, A n , in Eq. ( 2) arise from higher eigenfunctions of the Smoluchowski equation [Eq.( 1)] of index n ¼ 2; 3; 4; . . . .From the continuity of the density function, Pðx; tÞ, these terms have amplitudes A n ' 1=n 2 which results in the weakly perturbed region being dominated by A 2 .
Crucially, the amplitude of the first term (A 1 ) is generically dominating [22] by a factor proportional to the exponential of the effective barrier height, i.e., where is a number of order one that depends on the shape of the free-energy landscape GðxÞ.
In the presence of a force pulling the two ends of the protein apart, the height of the free-energy barrier becomes a decreasing function of the force.
The broadly used ''Bell'' expression constitutes the leading term only; here x u is the distance along the one-dimensional reaction coordinate to the freeenergy barrier and ÁG u ¼ ÁGðx u Þ.Even in the case where Eq. ( 4) holds (i.e., for ÁG u ! 1, F ! 0), when a large number of degrees of freedom are involved, the coefficient x u contains entropic contributions as well [23]; furthermore, high-dimensional folding landscapes strongly compromise the conclusions of one-dimensional analyses of folding times [22,24], making interpretations of x u as a physical distance perilous.However, the validity of Eq. ( 3) does not depend on the validity of Eq. ( 4).The latter can still be assumed in a small range of forces where ÁG u can be approximated as a linear function of F, which leads to unambiguous values of x u .
It should be noted that in Eq. ( 3) is computable if the form of the potential is known.For example, in the flat hypersphere case [22] treated by Bicout and Szabo, ¼ 9 2 =24 ' 3:7.If is independent of the force, the value of the coefficient, x u , extracted from Eq. ( 3) is independent of the exact value of .This is in strong contrast with models that attempt to extract x u from the force dependence of the mean unfolding time alone.We will show below that the assumption that is force independent is consistent with our simulations.
By performing the simulations in a broad range of forces, we evaluated the distribution of unfolding times at each force and extracted the 2 times, 1 and 2 , and the amplitudes, A 1 and A 2 .
From the double exponential fits of the distribution of unfolding times of our ubiquitin model, 2 is approximately constant with an average of 18.9 ps [Fig.2(a)].This value is lower than the $10 ns rate of looping in peptides [25] or the $1 s folding rate of the fastest proteins [26] that are often taken as rough estimations of the preexponential factor.There are two reasons for this: one is the low external friction at which the simulations have been performed (0:1 ps À1 $ 1000 times lower than water) and the other is that the ''internal friction'' of the model is lower than that of real proteins due to the smoother free-energy surface of a structure-based model designed to be ''minimally frustrated.''Decreasing the applied force from 250 to 200 pN results in the average unfolding time increasing to 8.5 ns and the distribution of unfolding times becoming single exponential [Fig.1(b)]: the effective barrier becomes high enough that the kinetics are indistinguishable from single exponential.In this case, the amplitude of the second term A 2 [Fig.2(b)] decreases to zero at low forces as expected.
The logarithm of A 1 =A 2 [Fig.2(c)] is perfectly linear, compatible with the assumption that and x u are independent of the force.Fitting the ratio of the two amplitudes to Eq. ( 3), the slope of lnðA 1 =A 2 Þ uniquely determines x u ¼ 1:5 # A, which is independent on the shape of the underlying free-energy profile.On the other hand, ÁG u and are related by ÁG u þ ¼ 10:9: if is that of a Bicout-Szabo barrier ( ' 3:7), we obtain ÁG u ¼ 5:8 kcal=mol; if varies by 1 order of magnitude, the estimation of ÁG u only varies by 1:4 kcal=mol.
To compare the values of ÁG u , x u and 2 obtained with Eq. ( 2) and (3), we independently determined the parameters by fitting 1 to the DHS equation [9].
where ¼ 1=2 corresponds to the cusp case, ¼ 2=3 to the linear-cubic case and the Bell form is recovered for either ¼ 1 or ÁG u ! 1.The fits of the 1 using the Eq. ( 5) with both ¼ 1=2 and 2=3 are shown in Fig. 2(d).
From the fit of the the activated time, 1 , we obtained x u ¼ 2:3 # A ( ¼ 1), x u ¼ 3:9 # A ( ¼ 2=3), x u ¼ 4:6 # A ( ¼ 1=2), and ÁG u ¼ 14:8 kcal=mol ( ¼ 2=3) and ÁG u ¼ 12:8 kcal=mol ( ¼ 1=2).These values of x u and ÁG u however, are those for zero force and cannot be directly compared with those obtained from Eqs. ( 3) and ( 4).Given a linear-cubic free-energy profile ( ¼ 2=3) with ÁG u ¼ 14:8 kcal=mol and x u ¼ 3:9 # A as estimated above, ÁG u and x u drop to 10-13 kcal=mol and 1.9-2.7A ˚, respectively, when the applied force is 200-300 pN.Such values are slightly larger than those obtained using our ''modelfree'' approach in the same range of forces.However, the values estimated from DHS depend on the assumed one dimensionality and shape of the free-energy profile (which is a central result of Ref. [9]), while those from the analysis of the distributions of unfolding times do not.Indeed, we have checked by performing Brownian dynamics on a onedimensional linear-cubic free-energy profile that in the range of forces where our approach [i.e., Eqs.(3)] is applicable (i.e., the kinetics are double exponential), it gives estimates of ÁG u and x u identical to directly fitting ÁG u of the linear-cubic profile to Eq. (4).Therefore, both the atomistic model of ubiquitin and 1D Brownian dynamics simulations corroborate the validity of Eq. (3).The similarity of the estimates of ÁG u and x u to the ''true'' values at zero force however, depends on the form of ÁG u , which for real proteins, is not only likely to show significant deviations from the simple shapes analyzed here (i.e., linear cubic), but is also likely to be multidimensional.
In conclusion, we demonstrate that the distribution of unfolding times in a range of forces provides crucial information not obtainable from average times: an independent estimate of both the preexponential factor (i.e., 2 ) and x u .Remarkably, both the preexponential factor 0 ' 2 and x u obtained from the analysis of the time distributions do not depend on a particular model (i.e., a form for the underlying free-energy profile) unlike all previously proposed models.It should be noted that it is not possible to obtain an estimate of the preexponential factor or ÁG with the broadly used Bell model.Conversely, the determination of ÁG u will depend on the value of , which is model dependent; however, the dependence on the model is much weaker than that observed within the DHS framework.Lastly, we show that a deviation from single exponential kinetics can be simply explained by a low barrier, which is a well-known result (see, e.g., Ref. [27]).Thus, as fast-improving atomic force instruments make the deter-

FIG. 1 (
FIG. 1 (color online).Cumulative probability of unfolding times at two different forces and 300 K. (a) A double exponential is required to fit the curve at F ¼ 250 pN.(b) At a lower force (F ¼ 200 pN) the free-energy barrier is larger and the distribution of unfolding times is closer to single exponential.

FIG. 2 (
FIG. 2 (color online).(a)Average unfolding time hti and fitting parameters from the double-exponential fit of the time distribution 1 and 2 ( 1 þ 2 ' hti).(b) Amplitude A 2 of the leading term of the expansion in Eq. (2); A 2 tends to 0 (i.e., PðtÞ can be approximated with a single exponential) for F 200 pN.(c) Fit of A 1 =A 2 to Eq. (3).(d) Activated time 1 fitted using Bell's relation (black line), DHS with ¼ 1=2 (red line) and ¼ 2=3 (blue line); data points for F > 300 pN have been disregarded in the fits.