Large deviations of the free energy in the p-spin glass spherical model

We investigate the behavior of the rare fluctuations of the free energy in the p-spin spherical model, evaluating the corresponding rate function via the Gärtner-Ellis theorem. This approach requires the knowledge of the analytic continuation of the disorder-averaged replicated partition function to an arbitrary real number of replicas. In zero external magnetic field, we show via a one-step replica symmetry breaking (1RSB) calculation that the rate function is infinite for fluctuations of the free energy above its typical value, corresponding to an anomalous, super-extensive suppression of rare fluctuations. We extend this calculation to non-zero magnetic field, showing that in this case this very large deviation disappears, and we motivate this finding in light of a geometrical interpretation of the scaled cumulant generating function.

The theory of disordered systems has been mainly developed to describe the typical behavior of physical observables. However, as has been argued since the early days of the subject, one can employ spin glass techniques in a more general setting, to estimate probability distributions [1] and fluctuations around the typical values [2,3] of quantities of interest. More recently, Rivoire [4], Parisi and Rizzo [5][6][7][8] and others [9][10][11] followed this line of thought, providing a bridge between spin glasses (and disordered systems more generally, as in [12]) and the theory of large deviations, which deals with rare events whose probability decays exponentially in the system size. This topic, which is the natural framework to set statistical mechanics in a mathematical perspective, has recently been the subject of a comprehensive and pedagogical review by Touchette [13], as well as of intensive efforts in non-equilibrium statistical physics [14].
The key quantity providing the bridge is of course very familiar to spin glass physicists and is given by
$$ G(k) = -\lim_{N\to\infty} \frac{1}{\beta N} \log \overline{Z_N^k}, \tag{1} $$
where $Z_N$ is the partition function for a system of size $N$ and $\overline{\cdots}$ is the average over disorder. The argument of the logarithm is the averaged replicated partition function and $k$ is the so-called replica index. From the viewpoint of large deviation theory, $G(k)$ is simply related to the scaled cumulant generating function (SCGF) of the free energy $f = \lim_{N\to\infty} f_N$, with $f_N = -(\beta N)^{-1} \log Z_N$, by
$$ \psi(k) = \lim_{N\to\infty} \frac{1}{N} \log \overline{e^{k N f_N}} = -\beta\, G(-k/\beta). \tag{2} $$
If the SCGF is differentiable one can show that the probability $P(f)$ describing the fluctuations of the free energy satisfies a large deviation principle:
$$ P(f_N \in [x, x + dx]) \sim e^{-N I(x)}\, dx, \tag{3} $$
and the rate function $I(x)$ is given by the Legendre transform of the SCGF:
$$ I(x) = \sup_{k \in \mathbb{R}} \left[ k x - \psi(k) \right]. \tag{4} $$
This is an application of a standard result of large deviation theory known as the Gärtner-Ellis theorem and describes the rare fluctuations of the free energy around the typical value $f_{\rm typ}$, which corresponds to the special point of the rate function $I(f_{\rm typ}) = 0$. From the disordered systems perspective, most of the standard results of spin glass theory obtained within the replica method concern only the very special limit $k \to 0$, since $f_{\rm typ} = \overline{f} = \psi'(0)$, whereas to obtain the full form of $I(x)$ that describes arbitrary rare fluctuations of the free energy one needs to work out the SCGF for finite replica index $k$. This problem is clearly equivalent to determining the full analytical continuation of the averaged replicated partition function from an integer to a real number of replicas $k$, and it was extensively investigated in the early stage of the research in disordered systems in order to understand the manifestation of the (at that time surprising) mechanism of replica symmetry breaking [15]. Since these results are particularly interesting from the more modern large deviation viewpoint, we briefly mention the main ones in the following.
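As an illustration of the Legendre transform in Eq. (4), the following is a minimal numerical sketch. It does not use the p-spin SCGF: the function `psi`, the parameters `mu` and `sigma2` and the grids are hypothetical choices, picked because for a quadratic SCGF the rate function is known in closed form, $I(x) = (x-\mu)^2/(2\sigma^2)$.

```python
import numpy as np

def legendre_transform(psi, k_grid, x_grid):
    """Grid approximation of I(x) = sup_k [k x - psi(k)]."""
    psi_vals = psi(k_grid)                                  # shape (K,)
    # For every x, build k*x - psi(k) over the k grid and take the maximum.
    return np.max(np.outer(x_grid, k_grid) - psi_vals[None, :], axis=1)

# Hypothetical toy SCGF (Gaussian fluctuations), NOT the p-spin one.
mu, sigma2 = -1.0, 0.5
psi = lambda k: mu * k + 0.5 * sigma2 * k**2

k_grid = np.linspace(-20.0, 20.0, 40001)
x_grid = np.linspace(-2.0, 0.0, 5)

rate = legendre_transform(psi, k_grid, x_grid)
exact = (x_grid - mu) ** 2 / (2.0 * sigma2)
print(np.max(np.abs(rate - exact)))   # small discretisation error
```

The supremum is replaced by a maximum over a finite window in $k$, so the sketch is only reliable when the maximiser lies well inside `k_grid`.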
Van Hemmen and Palmer [16] were the first to observe that the expression in Eq. (1) must be a convex function of the replica index $k$, as can be proven by exploiting Hölder's inequality. Shortly afterwards, Rammal [17] added that $\psi(k)/k$ must be monotonic, which is actually a necessary condition for the convexity of $\psi(k)$. However, the replica symmetric (RS) ansatz, which provides the most obvious analytical continuation to real $k$ of the replicated partition function, often gives a trial SCGF which is not convex, or such that $\psi(k)/k$ is not monotonic. This problem was first analyzed in the context of the Sherrington-Kirkpatrick (SK) model. After Parisi introduced his remarkable hierarchical scheme for replica symmetry breaking, Kondor [18] argued that the full RSB solution was very likely to provide a good analytical continuation of Eq. (1), not only around $k = 0$.
These results may be considered nowadays as the initial stage of a line of work that attempted to give mathematical soundness to the replica method. Although this vast program is mostly unfinished, Parisi and Rizzo realized that the original analysis presented by Kondor is fundamental to investigate the large deviations of the free energy in the SK model. Large deviations have been examined only for a few other spin glass models: Gardner and Derrida discussed the form of the SCGF in the random energy model (REM) in a seminal paper [19], and many rigorous results have been established later on [20]; Ogure and Kabashima [21][22][23] considered analyticity with respect to the replica number in more general REM-like models; Nakajima and Hukushima investigated the p-body SK model [10] and dilute finite-connectivity spin glasses [11] to specifically address the form of the SCGF for models where one-step replica symmetry breaking (1RSB) is exact.
In this manuscript we add one more concrete example to this list, considering the p-spin spherical model [24]. In zero external magnetic field, we show that the 1RSB calculation at finite $k$ produces an SCGF with a linear behavior below a certain value $k_c$; a nice geometrical interpretation of this, dating back to Kondor's work on the SK model [18], is discussed. Accordingly, the rate function is infinite for fluctuations of the free energy above its typical value, which are then more than exponentially suppressed in $N$ with respect to the standard case described by Eq. (3). This property, which is commonly described by stating that the free energy has a "very-large" deviation behavior for positive fluctuations, is present in several other spin glass problems, as discussed for example in [7], and, more generally, in other systems showing extreme value statistics [9]. In some of the early literature [25], this feature is also called "overfrustration".
The situation changes dramatically when a small external magnetic field is applied: the rate function is finite everywhere, although highly asymmetric around the typical value, and so the very-large deviation feature disappears. We explain intuitively the reason for this change of regime in light of the geometrical interpretation discussed for the case without magnetic field, and argue that the introduction of a magnetic field could act as a procedure to regularize the anomalous scaling of the large deviation principle for this kind of system. The manuscript is organized as follows: in Section II we derive the SCGF for the p-spin spherical model without magnetic field; then we employ the Gärtner-Ellis theorem to compute the corresponding rate function in the high- and low-temperature phases. In Section III we generalize the results to non-zero magnetic field and compare the SCGF and the rate function obtained with those of the previous case. In Section IV we summarize our results and discuss possible future directions. Finally, in Appendix A we discuss the details of the geometrical interpretation of the 1RSB ansatz.

II. LARGE DEVIATIONS OF THE p-SPIN SPHERICAL MODEL FREE ENERGY
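The p-spin glass spherical model consists of a p-body interaction of $N$ continuous spins with the following Hamiltonian:
$$ H_p = -\sum_{1 \le i_1 < \cdots < i_p \le N} J_{i_1 \ldots i_p}\, \sigma_{i_1} \cdots \sigma_{i_p}, \tag{5} $$
where the $J$-couplings are independent quenched random variables normally distributed with zero mean and variance
$$ \overline{J_{i_1 \ldots i_p}^2} = \frac{J^2\, p!}{2 N^{p-1}}, \tag{6} $$
while the spins are real variables with range in $(-\infty, \infty)$ subject to a global spherical constraint such that the measure is, up to a normalization factor that is irrelevant at leading order in $N$,
$$ d\mu(\sigma) = \delta\!\Big( N - \sum_{i=1}^N \sigma_i^2 \Big) \prod_{i=1}^N d\sigma_i. \tag{7} $$
These scalings guarantee the extensivity of the free energy. Its exact typical value is obtained within the replica formalism and a 1RSB ansatz, as done in the seminal work [24] by Crisanti and Sommers (CS).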
In the following we will analyze the large deviations of the free energy of this model. For the sake of brevity, we will not reproduce all the steps of CS, whose analysis we extend here to any real finite number of replicas.

A. From replicas to the scaled cumulant generating function
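We start our analysis from Eq. (3.16) of [24] with null magnetic field. Accordingly, the partition function is (up to finite-size corrections in $N$):
$$ \overline{Z_N^k} = \int \prod_{\alpha < \beta} dq_{\alpha\beta}\; e^{-N g(q)}, \tag{8} $$
where
$$ g(q) = -k\, s(\infty) - \frac{(\beta J)^2}{4} \sum_{\alpha, \beta = 1}^{k} q_{\alpha\beta}^{\,p} - \frac{1}{2} \log \det q \tag{9} $$
and $s(\infty) = [1 + \log(2\pi)]/2$ is the entropy density in the infinite temperature limit. To evaluate the integrals on $q_{\alpha\beta}$ we use the saddle point method together with the 1RSB ansatz, which is formulated in terms of the three parameters $(q_1, q_0, m)$:
$$ q_{\alpha\beta} = (1 - q_1)\, \delta_{\alpha\beta} + (q_1 - q_0)\, \epsilon_{\alpha\beta} + q_0, \tag{10} $$
with $\epsilon_{\alpha\beta}$ defined as
$$ \epsilon_{\alpha\beta} = \begin{cases} 1 & \text{if } \alpha \text{ and } \beta \text{ belong to the same diagonal block of size } m, \\ 0 & \text{otherwise.} \end{cases} \tag{11} $$
The eigenvalues of $q$, with the respective degeneracies, are
$$ \begin{aligned} \eta_1 &= 1 - q_1, & \text{degeneracy } & k - k/m, \\ \eta_2 &= 1 - q_1 + m (q_1 - q_0), & \text{degeneracy } & k/m - 1, \\ \eta_3 &= 1 - q_1 + m (q_1 - q_0) + k q_0, & \text{degeneracy } & 1. \end{aligned} \tag{12} $$
Using this and inserting the ansatz (10) in (9) we find
$$ \begin{aligned} g(k; q_1, q_0, m) = {} & -k\, s(\infty) - \frac{(\beta J)^2}{4} \left[ k + k (m - 1)\, q_1^p + k (k - m)\, q_0^p \right] \\ & - \frac{1}{2} \left[ \frac{k (m - 1)}{m} \log \eta_1 + \left( \frac{k}{m} - 1 \right) \log \eta_2 + \log \eta_3 \right]. \end{aligned} \tag{13} $$
This functional is evaluated numerically at the saddle point $(\bar q_1, \bar q_0, \bar m)$ for the 1RSB parameters for each value of $k$. The three parameters take values in the domains
$$ 0 \le q_0 \le q_1 \le 1, \qquad \begin{cases} k \le m \le 1 & \text{if } k < 1, \\ 1 \le m \le k & \text{(otherwise)}, \end{cases} $$
and for $k < 1$ the saddle point is obtained with a maximization of the functional instead of a minimization, as usual in replica theory. Using Eq. (2), we obtain an SCGF $\psi(k)$ which becomes linear above a certain value $k = k_c$, depending on temperature. To ease the visualization of this feature, in Fig. 1 we plot the function $G(k)/k = g(k; \bar q_1, \bar q_0, \bar m)/(k\beta)$ which, when $\psi(k)$ is linear, intersects the vertical axis at $f_{\rm typ}$. The figure does not change qualitatively for $p \ge 3$.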
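The structure of the 1RSB ansatz can be checked numerically for integer $k$ and $m$. The sketch below is not the code used for the figures: it assumes that the overlap matrix has entries 1 on the diagonal, $q_1$ inside diagonal blocks of size $m$ and $q_0$ elsewhere, and that the functional takes the standard form $g = -k\,s(\infty) - \frac{(\beta J)^2}{4}\sum_{\alpha\beta} q_{\alpha\beta}^p - \frac12 \log\det q$. The function names and parameter values are hypothetical.

```python
import numpy as np

def overlap_matrix(k, m, q1, q0):
    """k x k 1RSB matrix: 1 on the diagonal, q1 in m x m diagonal blocks, q0 elsewhere."""
    if k % m != 0:
        raise ValueError("k must be a multiple of m for an explicit matrix")
    q = np.full((k, k), q0, dtype=float)
    for b in range(k // m):
        q[b * m:(b + 1) * m, b * m:(b + 1) * m] = q1
    np.fill_diagonal(q, 1.0)
    return q

def logdet_1rsb(k, m, q1, q0):
    """log det q from the three eigenvalues and their degeneracies (valid for real k, m)."""
    eta1 = 1.0 - q1
    eta2 = 1.0 - q1 + m * (q1 - q0)
    eta3 = eta2 + k * q0
    return k * (m - 1) / m * np.log(eta1) + (k / m - 1) * np.log(eta2) + np.log(eta3)

def g_1rsb(k, m, q1, q0, p, beta, J=1.0):
    """Assumed 1RSB functional, continued to real k and m."""
    s_inf = 0.5 * (1.0 + np.log(2.0 * np.pi))
    sum_qp = k + k * (m - 1) * q1**p + k * (k - m) * q0**p
    return -k * s_inf - 0.25 * (beta * J) ** 2 * sum_qp - 0.5 * logdet_1rsb(k, m, q1, q0)

# Integer check: closed forms agree with the explicit matrix.
q = overlap_matrix(6, 3, 0.7, 0.2)
sign, logdet = np.linalg.slogdet(q)
print(logdet, logdet_1rsb(6, 3, 0.7, 0.2))

# With q0 = 0 the functional is proportional to k, so g/k does not depend on k.
ratios = [g_1rsb(k, 0.5, 0.6, 0.0, p=3, beta=2.0) / k for k in (0.3, 0.8)]
print(ratios)
```

The last two lines illustrate why a saddle point with $\bar q_0 = 0$ and $k$-independent $(\bar q_1, \bar m)$ yields a constant $G(k)/k$.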
The $p = 2$ case at low temperature is different: the 1RSB ansatz reduces to the RS one (that is, $\bar q_1 = \bar q_0$) as long as $k \ge 0$, therefore the typical values of all the thermodynamic quantities are obtained under the RS ansatz [26]. Conversely, for $k < 0$ we need to introduce again the 1RSB ansatz which, as in the $p \ge 3$ case, gives the linear behavior of the SCGF. In other words, $k_c = 0$ for the 2-spin spherical model for all $\beta > \beta_c$.
Before turning to the evaluation of the rate function, we discuss an interesting geometrical interpretation of the SCGF shape. To this aim, let us consider the RS ansatz (that is, Eq. (13) with $q_1 = q_0 = q$ and $m = 1$). As we can see in Fig. 1, the RS solution (blue curve) is non-monotonic for $\beta > \beta_c$. On the other hand, one can prove that $G(k)/k$ has to be a monotonic quantity, therefore the RS solution can be ruled out. We can check that the 1RSB solution gives a perfectly fine monotonic $G(k)/k$ (red curve in Fig. 1), as one could expect from the fact that this ansatz gives the correct typical free energy for this model. Interestingly, however, exactly the same monotonic curve can be obtained by using a much simpler geometric construction: just consider the RS solution, which is the right one for large $k$, and when $G(k)/k$ starts to be non-monotonic continue with a straight horizontal line (in the $G(k)/k$ vs $k$ plot). This construction actually dates back to Rammal [17] and is discussed in more detail in Appendix A. Here we limit ourselves to noticing that the $G(k)/k$ obtained by using the 1RSB ansatz and the one obtained with the Rammal construction are the same because of the following facts: (i) for $k > k_c$ the 1RSB and RS ansätze coincide ($\bar q_1 = \bar q_0 = \bar q \neq 0$) and $k_c$ is exactly the point where $G(k)/k$ is not monotonic anymore if one uses the RS ansatz; (ii) from the saddle point equations obtained by extremizing Eq. (13) when $k < k_c$, one obtains $\bar q_0 = 0$; (iii) the remaining saddle point equations fix $\bar q_1$ and $\bar m$, and one can see that these equations are identical to those needed to perform the Rammal construction, which fix the point $k_c$ and the parameter $\bar q$ of the RS ansatz.

B. Rate function and very large deviations
Starting from the SCGF evaluated in the last section, we perform a numerical Legendre transformation to obtain the rate function according to Eq. (4). The result is shown in Fig. 2 for different values of $\beta$. The rate function displays the following behavior:
• for $x = f_{\rm typ}$, it is null as expected;
• for $x < f_{\rm typ}$, $I(x)$ is finite, indicating that a regular large deviation principle holds for fluctuations below the typical value. When $\beta > \beta_c$ the SCGF is smooth, so we obtain the rate function via the Gärtner-Ellis theorem. On the other hand, when $\beta < \beta_c$ the SCGF is not differentiable at one point (see Fig. 1), so we are only able to obtain the convex hull of the rate function (see Fig. 2);
• for $x > f_{\rm typ}$, $I(x) = +\infty$. This is due to the linear behavior of the SCGF below $k_c$ discussed in the previous section and it is a signature of an anomalous scaling with $N$ of the rare fluctuations above the typical value.
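The mechanism behind the infinite branch of the rate function can be seen on a toy example. The sketch below assumes the convention $I(x) = \sup_k [k x - \psi(k)]$ with $\psi'(0) = f_{\rm typ}$, and uses a hypothetical SCGF (`psi_toy`, with an arbitrary value `F_TYP`) that is quadratic on one side of $k = 0$ and exactly linear on the other. It is not the p-spin SCGF. On the linear side the supremum is never attained, so the grid maximum grows without bound as the window in $k$ is enlarged.

```python
import numpy as np

F_TYP = -1.0  # hypothetical typical free energy of the toy model

def psi_toy(k):
    """Toy convex SCGF: F_TYP*k + k^2/2 for k < 0, exactly linear for k >= 0."""
    k = np.asarray(k, dtype=float)
    return F_TYP * k + 0.5 * np.where(k < 0.0, k, 0.0) ** 2

def rate_on_window(x, k_max, n=20001):
    """Maximum over a grid in [-k_max, k_max] of k*x - psi(k)."""
    k = np.linspace(-k_max, k_max, n)
    return np.max(k * x - psi_toy(k))

windows = (10.0, 100.0, 1000.0)
below = [rate_on_window(F_TYP - 0.5, K) for K in windows]  # converges to 0.125
above = [rate_on_window(F_TYP + 0.5, K) for K in windows]  # grows like 0.5 * K
print(below, above)
```

For $x < f_{\rm typ}$ the result is stable under enlargement of the window, whereas for $x > f_{\rm typ}$ it scales linearly with `k_max`, which is the numerical signature of $I(x) = +\infty$.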
An ambitious goal would be the identification of the correct behavior with $N$ of these very large deviations. Indeed, a more general way of stating a large deviation principle is
$$ P(f_N \in [x, x + dx]) \sim \begin{cases} e^{-a_N I_-(x)}\, dx & \text{for } x \le f_{\rm typ}, \\ e^{-b_N I_+(x)}\, dx & \text{for } x > f_{\rm typ}, \end{cases} \tag{14} $$
with $b_N / N \to \infty$. For this reason, fluctuations above the typical value are referred to as "very large deviations". The physical explanation of the substantial difference in scaling of the deviations of thermodynamical quantities below and above their typical values resides in the different number of elementary degrees of freedom involved in producing the corresponding fluctuation: while in the first case it is sufficient that only one of the elementary variables assumes an anomalous value below its typical one, the others being fixed, in the second case all the variables have to fluctuate, a joint event with probability heavily suppressed with respect to the first one. This argument shows the importance of resolving the anomalous scaling behavior leading to the very large deviations explained above. In general, however, although the Gärtner-Ellis theorem can be extended to find rate functions for large deviation principles with arbitrary speeds $a_N$, $b_N$, we lack techniques to compute the asymptotic scaling of $a_N$ and $b_N$ for large $N$, because of the additional inputs needed to calculate the corresponding SCGF with a saddle-point approximation (for some other systems this problem has been solved with ad-hoc methods [9,27], while in [7] a method is proposed in the context of the SK model).
In the next section we present the main result of our work, which could be useful to study this anomalous kind of fluctuations also in other problems: through an extension of the replica calculation to the case with an external magnetic field, we are able to numerically check that the very large deviation effect disappears. More precisely, we find that with a magnetic field, no matter how small, not only $a_N \sim N$ as before, but also $b_N \sim N$.

III. LARGE DEVIATIONS OF THE p-SPIN MODEL IN A MAGNETIC FIELD
In this section we generalize the previous discussion to the case of non-zero magnetic field. The Hamiltonian for the model is
$$ H = H_p - h \sum_{i=1}^{N} \sigma_i, $$
where $H_p$ is given in (5) and $h$ represents an external magnetic field coupled with the spins. The computation of the SCGF at $h \neq 0$ goes beyond the approach of the work by Crisanti and Sommers, who only considered the typical case. In contrast to the problem with $h = 0$, where the finite-$k$ calculation is a quite straightforward generalization of the standard one, here a more substantial effort is needed to extend the $k \to 0$ result. The starting point is Eq. (3.8) of [24], where the entries of the matrix $\lambda$ are auxiliary variables enforcing the constraints defining the overlap matrix and the spherical constraint in (7). In the presence of a magnetic field, the saddle-point integration over the $\lambda$ variables is not as straightforward as the one leading to (13). Starting from the full expression of $g(q, \lambda)$ before the $\lambda$-integration, Eq. (19), differentiation with respect to $\lambda_{\alpha\beta}$ leads to the saddle-point equations (20). These are solved via successive contractions of the replica indices: a double summation over $\alpha, \beta$ leads to an equation for the scalar $\sum_{\alpha\beta} \lambda^{-1}_{\alpha\beta}$, with two solutions $l_\pm$; similarly, a single contraction gives the row sums $\sum_{\beta} \lambda^{-1}_{\alpha\beta}$, and finally one obtains the entries $\lambda^{-1}_{\alpha\beta}$ themselves, Eq. (24). Given the 1RSB ansatz (10), $q_{\alpha\beta}$ has $k$ elements equal to 1 on the diagonal, $k(m - 1)$ elements $q_1$ in the internal blocks and the remaining $k^2 - k - k(m - 1) = k(k - m)$ elements $q_0$, so
$$ \sum_{\alpha, \beta} q_{\alpha\beta} = k \left[ 1 + (m - 1)\, q_1 + (k - m)\, q_0 \right]. $$
Every row (column) contains the same elements, so
$$ \sum_{\beta} q_{\alpha\beta} = 1 + (m - 1)\, q_1 + (k - m)\, q_0. \tag{26} $$
To find which of the parameters $l_\pm$ in Eq. (24) is the right one, we can perform the limit $k \to 0$: $\lambda$ has a finite limit only with $l_-$. With this choice, the structure of the solution of the saddle-point equations is the same as the one of $q_{\alpha\beta}$, with a constant added to each entry.
Thus, the entries of $\lambda^{-1}$ can be written in the form (31). It is also easy to see, inverting a matrix with a 1RSB structure, that $\lambda$ itself has a 1RSB structure, from which its eigenvalues follow. The next step is to evaluate the trace appearing in (19). Using all these ingredients, we can write the functional $g(q)$ in the 1RSB ansatz for finite $k$, Eq. (35). As in the previous section, we numerically obtain and plot, in Fig. 3, $G(k)/k = g(k; \bar q_1, \bar q_0, \bar m)/(k\beta)$, where again $\bar q_1, \bar q_0, \bar m$ are the solutions of the saddle point equations, obtained by extremization of Eq. (35). The most striking feature of these plots is the difference from those represented in Fig. 1: the linear behavior is replaced by curves (again given by the 1RSB ansatz) with non-null derivative. Let us analyze more closely what is going on and why the external magnetic field modifies the behavior of the system. As discussed in the last part of Sec. II, one can apply the Rammal construction to correct the non-monotonic behavior of the RS version of $G(k)/k$ (plotted as a blue curve in Fig. 3). Exactly as in the $h = 0$ case, the resulting function will be monotonic and linear, being the smooth continuation of $G(k)/k$ from $k_m$, the point where it loses its monotonicity. However, as one can see from Fig. 3, the result will not be the 1RSB solution. This difference from the $h = 0$ case can be seen as a consequence of the saddle point equations: now the equation for $q_0$ is non-trivial, and so $\bar q_0$, $\bar q_1$ and $\bar m$ all depend on $k$ also in the 1RSB phase, giving rise to the non-constant behavior of $G(k)/k$ also for $k < k_c$. It is worth mentioning another point: when $h = 0$, the critical point $k_c$ where the 1RSB solution departs from the RS one coincides with $k_m$, the point where the $G(k)/k$ obtained by the RS ansatz loses its monotonicity. By contrast, with $h \neq 0$, we have $k_c > k_m$ for $\beta > \beta_c$, so that the 1RSB branch departs from the RS one above $k_m$. Finally, we numerically checked that the shape of $G(k)/k$ below $k_c$ depends on $p$.
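The statement that inverting a matrix with a 1RSB structure gives back a matrix of the same structure can be verified numerically for integer sizes. The sketch below uses a hypothetical helper `one_rsb` and arbitrary entries; the matrix `lam` is only a stand-in with the right block structure and is not the actual saddle-point value of $\lambda$.

```python
import numpy as np

def one_rsb(k, m, d, a1, a0):
    """k x k matrix with d on the diagonal, a1 inside m x m diagonal blocks, a0 elsewhere."""
    mat = np.full((k, k), a0, dtype=float)
    for b in range(k // m):
        mat[b * m:(b + 1) * m, b * m:(b + 1) * m] = a1
    np.fill_diagonal(mat, d)
    return mat

k, m = 6, 3
lam = one_rsb(k, m, d=2.0, a1=-0.4, a0=-0.1)   # arbitrary illustrative entries
inv = np.linalg.inv(lam)

# Read one diagonal, one intra-block and one inter-block entry of the inverse,
# then rebuild a 1RSB matrix from them and compare with the full inverse.
rebuilt = one_rsb(k, m, d=inv[0, 0], a1=inv[0, 1], a0=inv[0, m])
row_sums = inv.sum(axis=1)

print(np.allclose(inv, rebuilt))   # True: the inverse has 1RSB structure
print(row_sums)                    # every row has the same sum
```

All rows of the inverse have the same sum, equal to the reciprocal of the eigenvalue of `lam` associated with the uniform vector; this is the property that makes the contractions over replica indices close on scalar equations.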
This change in the SCGF has an important effect, in turn, on the rate function: performing the numerical Legendre transformation of the SCGF we now obtain a continuous curve, meaning that very rare fluctuations are washed out, see Fig. 4. In other words, now the two quantities $a_N$ and $b_N$ introduced in Eq. (14) are such that $a_N \sim N$ and $b_N \sim N$. This effect is present also for very small magnetic field, even though the rate function is more and more asymmetric around $x = f_{\rm typ}$ as we decrease $h$: the infinite branch of the rate function in Fig. 2 is replaced by a curve that becomes gradually less steep as the magnetic field is increased.

IV. DISCUSSION
In this manuscript we analyzed the behavior of the large (and very large) deviations of the free energy for the spherical p-spin model, exploiting the Gärtner-Ellis theorem to obtain the rate function. Without external magnetic field, we are able to compute the rate function in the spin-glass phase, while in the paramagnetic phase we obtain its convex hull, due to the non-differentiability of the SCGF. As a result, we have a standard large deviation principle for fluctuations below the typical value of the free energy, that is, they are suppressed exponentially in the size of the system. On the other hand, fluctuations above the typical value have a different behavior, being suppressed more than exponentially, and the corresponding rate function is infinite. When a magnetic field is applied, this anomalous very large deviation disappears and the rate function is finite everywhere. Since this remains true even if the field is very small, an open question is whether this effect can be exploited to obtain insights on the very large fluctuations, by sending the magnetic field to zero while carefully choosing its dependence on the system size.
In addition, we provided a geometrical interpretation to support our numerical findings. Indeed we showed, as noticed previously in the literature for different models, that for $h = 0$ the Rammal construction is equivalent to the 1RSB ansatz. However, we also showed that this is due to the simple structure of the 1RSB ansatz without external magnetic field, where one can immediately fix one of the 1RSB parameters. When a magnetic field is applied, all the parameters have non-trivial values (which we obtained numerically by solving the saddle point equations) and the Rammal construction, which gives in turn the infinite-rate-function behavior, fails. Another interesting question is whether it is possible to generalize the geometrical construction by Rammal so that it corrects the RS solution in the right way not only for $h = 0$, but also when $h \neq 0$.

ACKNOWLEDGMENTS
The authors would like to thank Enrico Malatesta and Sergio Caracciolo for the useful discussions and suggestions.

Appendix A: Rammal construction
In this appendix we report the details of the geometrical construction reproducing the solution for the SCGF obtained with a 1RSB ansatz with $q_0 = 0$. The following observations can be traced back to Rammal's work [17] and can be found in [18] (similar considerations appear in [10,11,21]). We reproduce the reasoning here not only as a historical curiosity: first of all, we see it as an enlightening approach to the problem of the continuation of the replicated partition function to a real number of replicas, particularly suitable for a finite-$k$ analysis. Moreover, we note that this interpretation, whenever it works, gives a flavor of "uniqueness" (though not in a strict mathematical sense) to the resulting solution, being based only on the properties of convexity and extremality that the function $\psi(k)$ must have. In this respect, a generalization of this result would be of great interest in order to better understand the necessity of Parisi's hierarchical RSB procedure, which has been dubbed "magic" even in relatively recent works, like [28]; however, a true geometrical interpretation of the full machinery of RSB, beyond the simple case considered here, is still lacking. Finally, in the context of this paper we are able to show a case where the construction gives the correct answer (the p-spin spherical model at zero external magnetic field) and a case where it fails (when the field is switched on).
Some important properties of the function (2) can be derived in full generality using its definition only. Applying the Hölder inequality to the probability measure over the disorder,
$$ \overline{X Y} \le \left( \overline{X^{1/\alpha}} \right)^{\alpha} \left( \overline{Y^{1/(1-\alpha)}} \right)^{1-\alpha}, \qquad 0 < \alpha < 1, \tag{A1} $$
with $X$, $Y$ some positive observables, it is easy to prove that $\psi(k)$ must be a convex function of $k$ (using $X = e^{\alpha k_1 N f_N}$, $Y = e^{(1-\alpha) k_2 N f_N}$ in the formula above, then taking the log and the large-$N$ limit), and that $\psi(k)/k$ must be monotonic (using now $X = e^{\alpha k_1 N f_N}$, $Y = 1$).
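For completeness, the two steps can be spelled out as follows. This is a sketch that assumes the Hölder inequality in the form $\overline{XY} \le (\overline{X^{1/\alpha}})^{\alpha} (\overline{Y^{1/(1-\alpha)}})^{1-\alpha}$ with $0 < \alpha < 1$, and the definition $\psi(k) = \lim_{N\to\infty} N^{-1} \log \overline{e^{k N f_N}}$.

```latex
% Convexity: X = e^{\alpha k_1 N f_N}, Y = e^{(1-\alpha) k_2 N f_N}
\overline{e^{[\alpha k_1 + (1-\alpha) k_2] N f_N}}
  \le \left( \overline{e^{k_1 N f_N}} \right)^{\alpha}
      \left( \overline{e^{k_2 N f_N}} \right)^{1-\alpha}
\;\Longrightarrow\;
\psi\big(\alpha k_1 + (1-\alpha) k_2\big) \le \alpha\, \psi(k_1) + (1-\alpha)\, \psi(k_2).

% Monotonicity of psi(k)/k: X = e^{\alpha k_1 N f_N}, Y = 1
\overline{e^{\alpha k_1 N f_N}} \le \left( \overline{e^{k_1 N f_N}} \right)^{\alpha}
\;\Longrightarrow\;
\psi(\alpha k_1) \le \alpha\, \psi(k_1)
\;\Longrightarrow\;
\begin{cases}
\dfrac{\psi(\alpha k_1)}{\alpha k_1} \le \dfrac{\psi(k_1)}{k_1} & (k_1 > 0), \\[2ex]
\dfrac{\psi(\alpha k_1)}{\alpha k_1} \ge \dfrac{\psi(k_1)}{k_1} & (k_1 < 0).
\end{cases}
```

Since $\alpha k_1$ lies between $0$ and $k_1$, both cases state that $\psi(k)/k$ is non-decreasing in $k$ on each side of the origin.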
Given that, the explicit evaluation is performed for each system within replica theory: an ansatz is imposed on the form of the replica overlap matrix, the number of replicas $k$ is then continued from integer to real values, the corresponding $G(k)$ is evaluated with the saddle-point method for large $N$ and finally a check is performed a posteriori to verify its validity. In the SK model, the system originally considered by Rammal, at low temperatures the replica symmetric ansatz, which still gives the correct values of the positive integer moments of the partition function, fails to produce a sensible solution for the SCGF at $k < 1$, in at least three ways:
• it becomes unstable under variations around the saddle point (de Almeida-Thouless instability [29]) below $k = k_{\rm dAT}$;
• it produces a $G(k)$ that is non-concave (and so a non-convex $\psi(k)$) around $k = k_{\rm conv}$, meaning that $G''(k)$ changes sign at $k_{\rm conv}$;
• it produces a $G(k)/k$ that loses monotonicity at $k = k_m$.
In the SK model $k_{\rm dAT}$ is the largest ($k_{\rm dAT} > k_m > k_{\rm conv}$), and so it is the first problem one encounters in extrapolating the RS solution from integer values of $k$. However, from the point of view of convexity and monotonicity alone, Rammal proposed to build a marginally monotone $G(k)/k$ in a minimal way, starting from the RS solution and simply keeping it constant below $k_m$ at the value $G(k_m)/k_m$. While the resulting function is not the correct one for the SK model, which needs a full RSB analysis to be solved, surprisingly enough for the spherical p-spin model in zero magnetic field this approach reproduces the solution obtained with a 1RSB ansatz with $q_0 = 0$ (see Fig. 1). Notice that in the present model the RS solution suffers from the same inconsistencies as in the SK model, but now $k_m$ is the largest of the three problematic points.
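On sampled data, the construction amounts to locating the point where the RS ratio $G(k)/k$ stops being monotonic and freezing the ratio at that value below it. The following is a minimal sketch under stated assumptions: the input is sampled on an increasing grid, the RS ratio has a single interior maximum at $k_m$ and decreases above it (for a curve with a minimum instead, the sign would have to be flipped), and `rs_ratio` is a hypothetical stand-in curve, not the actual RS solution of the model.

```python
import numpy as np

def rammal_construction(k, ratio):
    """Return (k_m, corrected ratio) for ratio = G(k)/k sampled on an increasing grid.

    Assumes a single interior maximum at k_m, with ratio decreasing for k > k_m.
    Below k_m the ratio is replaced by the constant value ratio(k_m).
    """
    i_m = int(np.argmax(ratio))
    corrected = ratio.copy()
    corrected[:i_m] = ratio[i_m]
    return k[i_m], corrected

k = np.linspace(0.01, 2.0, 200)
rs_ratio = -1.0 - (k - 0.5) ** 2      # hypothetical non-monotonic RS-like curve
k_m, fixed = rammal_construction(k, rs_ratio)
print(k_m, fixed[0])
```

The corrected curve is marginally monotone by construction, and its plateau value plays the role of $G(k_m)/k_m$ in the text.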
To convince the reader that the two approaches are actually equivalent we prove, as the final part of this appendix, that without an external magnetic field the 1RSB solution of the spherical p-spin model and the Rammal construction coincide. In order to obtain this result, we have to prove that:
• the 1RSB solution for $G(k)/k$ becomes a constant below $k = k_c$, which is defined as the point where the RS and 1RSB ansätze branch out, as we did in the main text;
• this constant is the same as the one in the Rammal construction, that is $G(k_m)/k_m$;
• the points $k_c$ and $k_m$ are the same.
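As $k_c$ is the point where the RS solution is not optimal anymore, for $k < k_c$ we have $\bar q_0 = 0$, as discussed in [24]. Let us now consider Eq. (13) with $q_0 = 0$: differentiating with respect to $q_1$ and $m$ and setting the results equal to 0 we get the equations for $\bar q_1$ and $\bar m$, which read
$$ \begin{aligned} & \mu\, \bar q_1^{\,p-2}\, (1 - \bar q_1) \left[ 1 + (\bar m - 1)\, \bar q_1 \right] = 1, \\ & \frac{\mu}{p}\, \bar q_1^{\,p} + \frac{1}{\bar m^2} \log \frac{1 - \bar q_1}{1 + (\bar m - 1)\, \bar q_1} + \frac{\bar q_1}{\bar m \left[ 1 + (\bar m - 1)\, \bar q_1 \right]} = 0, \end{aligned} \tag{A2} $$
where $\mu = p(\beta J)^2/2$. These equations can be solved numerically (as we did to obtain the plots in the main text), but to show our point here we do not really need the explicit solution. Indeed it is enough to notice that $\bar m$ and $\bar q_1$ do not depend on $k$ and therefore $g(k; \bar q_1, 0, \bar m)/k$ is a constant. Then, we need to check that it is the same constant as the one obtained by Rammal. Again starting from Eq. (13), by putting $q_1 = q_0 = q$ we obtain the RS solution, which is
$$ g_0(k, q) = -k\, s(\infty) - \frac{(\beta J)^2}{4}\, k \left[ 1 + (k - 1)\, q^p \right] - \frac{1}{2} \left[ (k - 1) \log(1 - q) + \log\big( 1 + (k - 1)\, q \big) \right]. \tag{A3} $$
In this case, extremizing with respect to $q$, we have an equation which gives the RS solution on the saddle point, $\bar q$. To find $k_m$, we then require $\frac{\partial}{\partial k} \left[ g_0/k \right] = 0$. The two resulting equations are
$$ \begin{aligned} & \mu\, \bar q^{\,p-2}\, (1 - \bar q) \left[ 1 + (k_m - 1)\, \bar q \right] = 1, \\ & \frac{\mu}{p}\, \bar q^{\,p} + \frac{1}{k_m^2} \log \frac{1 - \bar q}{1 + (k_m - 1)\, \bar q} + \frac{\bar q}{k_m \left[ 1 + (k_m - 1)\, \bar q \right]} = 0, \end{aligned} \tag{A4} $$
which are exactly Eqs. (A2) with $k_m$ instead of $\bar m$ and $\bar q$ instead of $\bar q_1$. Therefore $k_m = \bar m$ and $\bar q = \bar q_1$, and one can check that $g(k; \bar q, 0, k_m)/k = g_0(k_m, \bar q)/k_m$.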
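The stationarity conditions can be cross-checked against finite differences of the functional. The sketch below assumes that the 1RSB functional with $q_0 = 0$ has the standard form $g/k = -s(\infty) - \frac{\mu}{2p}\left[1 + (m-1) q_1^p\right] - \frac12\left[\frac{m-1}{m}\log(1-q_1) + \frac1m \log(1 + (m-1) q_1)\right]$ with $\mu = p(\beta J)^2/2$. Under this assumption the derivatives are $\partial_{q_1}(g/k) = -\frac{(m-1)\, q_1}{2 (1-q_1)[1+(m-1)q_1]}\, r_q$ and $\partial_m (g/k) = -\frac12\, r_m$, where $r_q$ and $r_m$ are the residuals of the two stationarity equations. The function names and the test point are hypothetical.

```python
import numpy as np

P, BETA, J = 3, 2.0, 1.0
MU = 0.5 * P * (BETA * J) ** 2

def g_over_k(q1, m):
    """Assumed 1RSB functional at q0 = 0, divided by k (it does not depend on k)."""
    s_inf = 0.5 * (1.0 + np.log(2.0 * np.pi))
    return (-s_inf - MU / (2 * P) * (1 + (m - 1) * q1**P)
            - 0.5 * ((m - 1) / m * np.log(1 - q1) + np.log(1 + (m - 1) * q1) / m))

def residuals(q1, m):
    """Left-hand sides of the two stationarity equations, written as r = 0."""
    d = (1 - q1) * (1 + (m - 1) * q1)
    r_q = MU * q1 ** (P - 2) * d - 1.0
    r_m = (MU / P * q1**P + np.log((1 - q1) / (1 + (m - 1) * q1)) / m**2
           + q1 / (m * (1 + (m - 1) * q1)))
    return r_q, r_m

q1, m, eps = 0.6, 0.5, 1e-6
dq = (g_over_k(q1 + eps, m) - g_over_k(q1 - eps, m)) / (2 * eps)
dm = (g_over_k(q1, m + eps) - g_over_k(q1, m - eps)) / (2 * eps)
r_q, r_m = residuals(q1, m)
pref = -(m - 1) * q1 / (2 * (1 - q1) * (1 + (m - 1) * q1))
print(dq, pref * r_q)
print(dm, -0.5 * r_m)
```

The point $(q_1, m) = (0.6, 0.5)$ is not a saddle point. It is used only to confirm that the gradient of the functional is proportional to the residuals, so that both vanish together.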
It only remains to prove that $k_c$ and $k_m$, which in general can be different points, are actually the same. As the 1RSB ansatz gives the correct solution for the present model, the corresponding SCGF must be convex and thus, in particular, continuous. The only way to obtain a continuous function which is equal to the RS one above $k_c$ and to Rammal's constant below is to take $k_c = k_m$, and so the two functions coincide everywhere.