QCD thermodynamics from lattice calculations with non-equilibrium methods: The SU(3) equation of state

A precise lattice determination of the equation of state in SU(3) Yang-Mills theory is carried out by means of a simulation algorithm, based on Jarzynski's theorem, that allows one to compute physical quantities in thermodynamic equilibrium, by driving the field configurations of the system out of equilibrium. The physical results and the computational efficiency of the algorithm are compared with other state-of-the-art lattice calculations, and the extension to full QCD with dynamical fermions and to other observables is discussed.


Introduction and motivation
The phenomenology of the strong interaction at high temperatures and/or densities remains one of the most interesting (yet somehow elusive) research areas in the physics of elementary particles. As nicely summarized by B. Müller in his lecture at the 2013 Nobel Symposium on LHC Physics [1], the novel state of matter produced in nuclear collisions at LHC and RHIC reveals unique features: it is strongly coupled, but highly relativistic; at high temperature it displays the distinctive collective phenomena of a liquid, whereas at low temperatures it turns into a gas of weakly interacting hadrons; while its shear viscosity η is nearly 18 orders of magnitude larger than the one measured for superfluid helium and even 26 orders of magnitude larger than the one of ultracold atoms [2], the ratio of the shear viscosity over the entropy density s is actually lower than for those substances, and close to the fundamental quantum-mechanical bound 1/(4π) [3]; moreover, it thermalizes in a very short time, close to the limits imposed by causality. Finally, the quark-gluon plasma (QGP) is not simply a "rearrangement" of ordinary nuclear matter: rather, it "creates" its own ground state, in which two characterizing features of the hadronic world, color confinement and dynamical chiral-symmetry breaking, are lost.
At the temperatures reached in present heavy-ion-collision experiments-which, when expressed in natural units = c = k B = 1, are of the order of hundreds of MeV [4]-the QGP is strongly coupled: this demands a theoretical investigation by non-perturbative tools, and the regularization of quantum chromodynamics (QCD) on a Euclidean lattice [5] is the tool of choice for this purpose. Over the past few years, several physical observables relevant for finite-temperature QCD have been studied on the lattice (see refs. [6] for reviews): one of the most prominent among them is the QCD equation of state [7], which determines the evolution of the Universe shortly after the Big Bang, as well as the evolution of the matter produced in the "little bang" at ultrarelativistic nuclear colliders.
While state-of-the-art results for the QCD equation of state, obtained by different collaborations using slightly different types of lattice discretizations, are now consistent with each other, it is worth remarking that such computations still require large computational power, and the multiple extrapolations to the physical limit are far from trivial. For example, in the standard "integral" method [8], the fact that quantum fluctuations at the lattice cutoff scale induce a strong ultraviolet divergence in the free energy associated with the QCD partition function, implies that bulk quantities at thermal equilibrium, such as the pressure p at a finite temperature T , have to be extracted by subtracting the corresponding quantities evaluated in vacuum, and are encoded in numbers that scale like O(a D ) (a being the lattice spacing, and D the Euclidean spacetime dimension, i.e. four): this constrains the values of a that can be probed in these simulations and, as a consequence, the control over systematic uncertainties affecting the extrapolation to the continuum. Similarly, in simulations with staggered fermions, residual taste-symmetry-breaking effects can have an impact on the extrapolation of the quark masses to the physical limit.
Due to these challenges, in the past few years there has been renovated interest in alternative methods to compute the equation of state. In particular, we would like to mention two recent studies, based upon the gradient flow [9] (see also ref. [10], which reported the first calculation of thermodynamic quantities using this method, and the very recent ref. [11], for an application in SU(2) Yang-Mills theory) and on the formulation of the theory in a moving reference frame [12]: both of them have been successfully tested in SU(3) Yang-Mills theory without quarks, and can be extended to full QCD without major obstructions [13]. The thermal properties of a purely gluonic theory, albeit not relevant for a quantitative comparison with experiments, can reveal important universal features, shared by theories with different gauge symmetry [9,10,12,[14][15][16][17][18][19] and/or in different dimensions [20], and, by virtue of the limited computational power required for their numerical Monte Carlo simulation, provide a useful benchmark for new algorithms.
In this manuscript, we present yet another method to compute the SU(3) equation of state, which is based on Jarzynski's theorem [21,22]: as will be discussed in detail in section 2, this theorem encodes an exact relation between the ratio of the partition functions associated with two different ensembles (which, in this case, are defined as those of the theory at two different temperatures) to an exponential average of the work done on the system during a non-equilibrium transformation driving it from one ensemble to the other. As will be discussed in detail below, calculations of the pressure based on this technique still require the subtraction of ultraviolet vacuum contributions, as with the integral method; however, they strongly reduce the computational costs associated with thermalization, since, in contrast to the integral method, only the field configurations at the first temperature in each trajectory need to be thermalized. Jarzynski's theorem is closely related to a set of powerful mathematical identities in non-equilibrium statistical physics, which have been developed since the 1990's [23]. A first example of application of Jarzynski's theorem in numerical simulations of lattice gauge theory was presented in ref. [24], but the technique is quite general and versatile, and can be used for a variety of different lattice QCD problems (at zero or at finite temperature). In section 3, after laying out the setup of our numerical calculations, we report a set of high-precision results for the SU(3) equation of state obtained using this method, along with a detailed discussion of the underlying physics, and with a comparison to studies based on different methods [9,12,17]. Section 4 is devoted to a discussion of the computational efficiency of our method and to some concluding remarks. A summary of this work has been reported in ref. [25].

Jarzynski's equality
In this section, after stating Jarzynski's theorem, we first demonstrate it in a Hamiltonianevolution framework, following ref. [21], in subsection 2.1. Then, in subsection 2.2, we present a different derivation [22], based on a master-equation formalism, which is more directly relevant for a practical implementation in Monte Carlo calculations.
Jarzynski's equality [21,22] is a theorem in statistical mechanics, that relates equilibrium and non-equilibrium quantities.
Consider a classical statistical system, which depends on a set of parameters λ (defined in a space Λ), and let H denote its Hamiltonian, which is a function of the degrees of freedom φ. When the system is in thermal equilibrium at temperature T , the partition function, defined as (where {φ} denotes the sum over all possible φ configurations, and, depending on the nature of φ and on the theory, may be a finite or an infinite sum, a multiple integral, or a suitably defined functional integral), is related to the Helmholtz free energy F via Z = exp(−F/T ). In eq. (1), both the partition function and the free energy, like H, are functions of λ. Let λ in and λ fin denote two distinct values of λ in parameter space, and let Z λ in and Z λ fin denote the partition functions of the system in thermodynamic equilibrium, when its parameters take values λ = λ in and λ = λ fin , respectively. For a given physical observable O, let O λ denote the statistical average of O in thermal equilibrium in the ensemble with parameters fixed to λ. Consider now the situation in which the parameters of the system are varied as a function of time t in a certain interval (which can be either finite or infinite) of extrema t in and t fin , according to some, arbitrary but well-specified, function λ(t) (or "protocol" for the parameter evolution), with λ(t in ) = λ in and λ(t fin ) = λ fin . Assume that, starting from an initial equilibrium configuration at t = t in , the parameters are let evolve in time, according to the λ(t) function; accordingly, the dynamical variables φ respond to the variation in the λ parameters, and themselves evolve in time, spanning a trajectory in the field-configuration space. In general, the configurations at all t > t in are not thermalized, i.e. the λ(t) parameter evolution drives the system out of equilibrium (except when t fin − t in is infinite, so that the switching process is infinitely slow). Let W denote the total work done on the system during its evolution from t in to t fin ; since the system is driven out of equilibrium, the mean value of the work W obtained by averaging over an ensemble of such transformations, is in general larger than or equal to the free-energy difference ∆F = F λ fin − F λ in of equilibrium ensembles with parameters λ = λ in and λ = λ fin : Note that W − ∆F is the amount of work dissipated during the parameter switch, which is directly related to the entropy variation, hence the inequality (2) is nothing but an expression of the second law of thermodynamics. Also, when the parameter switch is infinitely slow (i.e. for ∆t = t fin − t in → ∞) the system remains in thermodynamic equilibrium throughout the switching process, the transformation is reversible, and the equality sign holds. However, if one considers the exponential average of the work, then it is possible to prove that it is directly related to ∆F through the following equality: Eq. (3) is the main statement of Jarzynski's theorem [21]. Before discussing the proof of eq. (3) for generic ∆t, we observe that when ∆t → ∞, the equality holds: in this limit, the parameter switch from λ in to λ fin is infinitely slow, the transformation becomes quasi-static, the system remains in equilibrium for the whole duration of the process, so that the work done on the system is equal to for every trajectory interpolating between the initial and final ensembles. Hence, in this limit one has W = W . Moreover, in this limit one also has W = ∆F , thus the left-hand side of eq. (3) can be written as and eq. (3) is trivially recovered.

Derivation in a Hamiltonian-evolution framework
To prove eq. (3) for finite ∆t, let us first consider the case in which the system is initially in thermal equilibrium with a heat reservoir at temperature T , but is isolated from it during the switching process from t in to t fin . Then, one can express the average over the ensemble of trajectories appearing on the left-hand side of eq. (3) in terms of the time-dependent probability density in the space of configurations, that we denote as ρ = ρ(φ, t). Given that at t = t in the system is in thermal equilibrium at temperature T , ρ satisfies the initial condition ρ(φ, t in ) = exp −H λ(t in ) (φ)/T /Z λ(t in ) ; moreover, since the system is in isolation during the switching process, the time evolution of ρ at t > t in is given by Liouville's equationρ = {H λ , ρ}, where the quantity appearing on the righthand side is the Poisson bracket of H λ and ρ. The evolution law expressed by Liouville's equation is fully deterministic, and a one-to-one mapping exists between each configuration at a generic time t and a configuration φ in at the initial time t = t in . As a consequence, the work accumulated along a trajectory going through a configuration φ at a generic time t is well-defined and equal to Thus, the work accumulated during the evolution starting from t = t in and leading to a final configuration φ at t = t fin is simply w(φ, t fin ), and the average appearing on the left-hand side of eq. (3) can be expressed as Liouville's theorem implies the conservation of the trajectory density in phase space: hence, , so that eq. (7) can be rewritten as If the system remains coupled to a heat reservoir doing the parameter switch (and the coupling of the system to the reservoir is sufficiently small), then this argument can be repeated for the union of the system and the reservoir, which can be thought of as a larger system, that remains isolated during the process. Then, the work performed on the system equals the difference of the total energy, evaluated on the final and on the initial configuration. This difference does not depend on the switching time, therefore it can be evaluated in the ∆t → ∞ limit, in which, as we discussed above, eq. (3) holds. Actually, one can prove that the assumption of weak coupling between the system and the reservoir can be relaxed, if the reservoir is mimicked by a Nosé-Hoover thermostat [26] or a Metropolis algorithm, as is the case in Monte Carlo simulations.

Derivation in the master-equation formalism
Eq. (3) can also be derived using a master-equation approach, and assuming a completely stochastic (rather than deterministic) evolution for the trajectory [22]. Here and in the following, we will use the symbol φ to denote a field configuration of the system, and φ(t) will denote a field configuration at time t. Here, the time evolution of φ is assumed to be given by a stochastic process; as a result of this stochastic process, the field configuration changes with time, and, following ref. [22], we will call this process a "trajectory" in the space of the possible configurations of the system. Let P (φ , t|φ, t + ∆t) denote the conditional probability of finding a field configuration φ at time t + ∆t, given that the system was in configuration φ at time t, and define the instantaneous transition rate from φ to φ as Note that this quantity depends on time only through the time-dependence of λ. Consider now an ensemble of stochastic, Markovian temporal evolutions (or trajectories) of the system, given a certain, fixed time-evolution of its parameters, λ(t): the distribution density of these trajectories in the space of configurations of the system, denoted as f (φ, t), obeys where the last equality is the definition of theR λ operator. If λ does not depend on time, then the formal solution of eq. (10), with the boundary condition that at t = t in the distribution density equals f in (φ), can be written as In this case, φ(t) reduces to a standard, stationary Markov process: then, the distribution density f (φ, t) becomes time-independent and the left-hand side of eq. (10) vanishes. Thus, the Markov process generates an ensemble of configurations distributed according to the canonical Boltzmann distribution for a system with Hamiltonian Eq. (12) means that the canonical distribution is preserved by the Markov process under consideration. Note that, if the Markov process satisfies detailed balance, i.e. if then eq. (12) follows: this can be easily proven by expressing exp[−H λ (φ )/T ]R λ (φ , φ) in terms of exp[−H λ (φ)/T ] and R λ (φ, φ ) using eq. (13), and then summing (or integrating) over the φ values. The converse is in general not true, but, given that the distinction between eq. (12) and eq. (13) is immaterial for our present discussion, for the sake of simplicity we will nevertheless refer to eq. (12) as to the "detailed-balance condition", as was done in ref. [22]. 1 Let us assume that the initial distribution at time t = t in is a canonical one, f in (φ) ∝ exp[−H λ(t in ) (φ)/T ], let Q(φ, t) denote the average value of exp[−w(φ, t)/T ] over all trajectories going through a particular configuration φ at a generic time t. Introducing the distribution defined as the average of exp(−W/T ) over all trajectories can be expressed as From its definition by eq. (15), it is easy to see that the time derivative of g is given by In particular, the third equality appearing in eq. (17) can be proven by imagining that φ(t) represents the "motion of a particle with a time-dependent mass exp[−w(φ, t)/T ]" (this motion is supposed to take place in the space of configurations), so that Q(φ, t) can then be interpreted as the "average mass" of the particles that at time t go through φ(t), and g(φ, t) represents the "average mass density" of the particles that go through φ at time t. The time dependence of such "average mass density" would then be induced by two terms: first, the one due to the the "flow" of these "particles", which is encoded in eq. (10), and, second, by the fact that the particle "mass" The time derivative of g is then given by the sum of these two terms, which yields eq. (17). For another derivation of eq. (17), see ref. [22, appendix A]. Note that eqs. (6) and (15) imply that, at t = t in : where we used the fact that the initial distribution is a canonical one. 1 One can also assume the stronger condition that, when t − tin → ∞, the Markov process always generates a canonical Boltzmann distribution, i.e. that for any, arbitrary, initial distribution fin(φ): so that, for sufficiently long times, the Markov process always leads to thermalization of any distribution. Note that eq. (14) is stronger than and implies eq. (12). For our present purposes, however, only eq. (12) is needed.
According to eq. (12),R λ annihilates N exp(−H λ /T ) (where N is an arbitrary constant factor), hence: which means that N exp(−H λ /T ) is solution to eq. (17). The solution consistent with the boundary condition specified by eq. (18) has N = 1/Z in , so that Plugging eq. (20), evaluated at t = t fin , into eq. (16), one finally obtains which proves Jarzynski's theorem. Note that, even though the distribution of φ is a canonical one only at t = t in , in the last term of eq. (21) the canonical partition function of the system at the final value of λ appears, and that this equation relates a genuinely out-of-equilibrium quantity (the average appearing in the first term) to a ratio of equilibrium quantities.
This proof of Jarzynski's equality provides a natural way to implement a numerical evaluation of the free-energy difference appearing on the right-hand side of eq. (3) by Monte Carlo simulation: 2 having defined a parameter evolution λ(t), with t in ≤ t ≤ t fin , that interpolates between the initial and final ensembles, and starting from a canonical distribution of configurations, one can drive the system out of equilibrium by varying λ as a function of Monte Carlo time, letting the configurations evolve according to any Markov process that satisfies the detailed-balance condition expressed by eq. (12), and compute exp(−W/T ) during this process. The average expressed by the bar notation on the left-hand side of eq. (3) is then obtained by averaging over a sufficiently large number of such trajectories. This is the numerical strategy that we use in this work, in which the Euclidean action S plays the rôle of H/T . We close this section with a word of caution. The computational efficiency of this method may strongly depend on the properties of the system under consideration: in particular, physical systems with a very large number of degrees of freedom (such as quantum field theories regularized on a spacetime lattice) have sharply peaked statistical distributions, hindering an accurate sampling of the configuration-space regions that contribute mostly to exp(−W/T ). If the different values of W in different trajectories are much larger than the scale of typical thermal fluctuations (or of typical quantum fluctuations, for lattice simulations of quantum field theory), then exp(−W/T ) is dominated by configurations in which the value of W is much smaller than W , and an accurate determination of exp(−W/T ) may require a prohibitively large number of trajectories. Note, however, that, in the numerical calculation of free-energy differences by eq. (3), there exists a remarkable difference in the rôles of the initial and final ensembles: one assumes that the initial configurations are thermalized, while the field values at all t > t in (including, in particular, at t = t fin ) are out of equilibrium. This asymmetry between the initial and target ensembles implies that, if the Monte Carlo determination of ∆F is biased by effects due to limited statistics, then carrying out the same calculation in the opposite direction will, in general, give a result different from −∆F . Conversely, verifying that a "direct" and a "reverse" computation give consistent results, provides a powerful test of the correctness of the calculation. This is a test that all results of our present work pass with success.

Lattice calculation of the SU(3) equation of state
In this work, we investigate the behavior of QCD at finite temperature, and compute the equation of state via lattice simulations using an algorithm based on Jarzynski's equality eq. (3).
In particular, we focus on the pure-glue sector, which captures the main feature of thermal QCD at the qualitative level: the existence of a confining phase at low temperatures, in which the physical states are massive color singlets, and a deconfined phase at high temperatures, in which chromoelectrically charged, light, elementary particles interact with each other through screened, long-range interactions. 3 Thermal screening of both electric and magnetic field components is, indeed, a characterizing feature of the deconfined phase of non-Abelian gauge theories, which defines it as a "plasma". Asymptotic freedom implies that, when the temperature T is very high, the physical coupling g at the scale of thermal excitations, O(T ), becomes small; in this limit, chromoelectric fields are screened on distances inversely proportional to gT , while chromomagnetic fields are screened on lengths inversely proportional to g 2 T , so that the theory develops a well-defined hierarchy of scales, between "hard" (of the order of T ), "soft" (of the order of gT ), and "ultra-soft" (of the order of g 2 T ) modes, and this separation of scales allows for a systematic treatment in terms of effective theories [29][30][31][32]. The appearance of the soft and ultra-soft scales is due to the existence of infra-red divergences, which lead to a breakdown of the correspondence between the number of loops in Feynman diagrams and the order in α s in perturbative calculations [33], and to the intrinsically non-perturbative nature of long-wavelength modes at all temperatures. Moreover, for plasma excitations on the energy scale of the deconfinement temperature, the physical coupling is not very small, so that the deconfined state of matter cannot be reliably modeled as a gas of free partons.
For these reasons, the study of the equation of state of QCD-or of its gluonic sector, that we are focusing on here-close to deconfinement requires non-perturbative techniques. We carry out this study by discretizing the Euclidean action of SU(3) Yang-Mills theory on a hypercubic lattice Λ of spacing a, spatial volume V = L 3 = (aN s ) 3 and extent aN t along the compactified Euclidean-time direction, using the Wilson action [5] where β = 6/g 2 0 , with g 0 the bare coupling, and The partition function of the lattice theory is given by (where dU µ (x) is the Haar measure for the SU(N ) matrix defined on the oriented link from site x to site x + aμ) and expectation values are defined as The integrals on the right-hand side of eq. (25) are estimated numerically, by Monte Carlo integration, from a sample of field configurations produced in a Markov chain; our update algorithm combines one heat-bath [34] and five to ten over-relaxation steps [35] on the link variables of the whole lattice: this defines a "sweep". The uncertainties in these simulation results are estimated with the jackknife method [36]. The physical temperature of the system T = 1/(aN t ) is varied by varying a, which, in turn, can be continuously tuned by varying β: to this purpose, we set the scale of our lattice simulations by means of the Sommer scale r 0 [37] as determined in ref. [38]. The critical temperature is related to r 0 by T c r 0 = 0.7457(45) [18]. 4 Our lattice determination of the equation of state rests on the following thermodynamic identity, relating the pressure p to the free energy per unit volume f = F/V , which holds in the thermodynamic limit, V → ∞, and receives negligible corrections for the L and T values considered here [16,41]. Following the algorithmic strategy discussed in ref. [24] for a benchmark study in the SU(2) theory, we study how the dimensionless p(T )/T 4 ratio varies as a function of the temperature, starting from an initial temperature T in : In our simulations, we compute Z(T )/Z(T in ) by means of Jarzynski's equality, using β (by tuning which, as stated above, the temperature can be varied continuously) as the λ parameter: β is let evolve linearly with the Monte Carlo time t between the initial (β in ) and final (β fin ) values corresponding to T in and T , respectively. More precisely, the β interval is discretized in N equal intervals of width ∆β, so that β n = β in + n(β fin − β in )/N = β in + n∆β. Finally, one should remember that the p(T ) and p(T in ) terms appearing on the left-hand side of eq. (27) also include contributions from quantum (non-thermal) fluctuations, that depend on the lattice cutoff and diverge in the continuum limit. These contributions can be removed from p by evaluating the quantity appearing on the right-hand side of eq. (27) on a lattice of large hypervolume (aN 0 ) 4 at T = 0 at the same a. This leads us to define the physical, renormalized pressure as where ∆S is the variation in Euclidean action during a non-equilibrium trajectory in configuration space: the N 3 s × N t and N 4 0 subscripts respectively indicate that this quantity is evaluated on a finite-or on a zero-temperature lattice, γ = N 3 s × N t /N 4 0 , and the bar denotes the average over a sample of n traj non-equilibrium trajectories, which start from canonically distributed initial configurations {U (t 0 )}. We remark that, in each of these trajectories, only the initial configuration is thermalized; then one starts driving the system out of equilibrium (by varying its parameters, in this case β) and all subsequent configurations that are produced during the same trajectory are not let thermalize.
Note that the summands on the right-hand side of eq. (29) are given by the action difference induced by a variation of β on the same field configuration. In practice, in order to scan a wide temperature range, from the confining to the deconfined phase, it is more convenient to divide the temperature interval in a number (that we denote as n int ) of smaller intervals. In particular, we choose these intervals in such a way that they do not stretch across different phases: this allows us to get rid of potential difficulties that might arise in the numerical sampling of configurations, when the algorithm tries to probe the physics at T > T c , by driving configurations in the T < T c phase out of equilibrium, without letting them thermalize. 5 Dividing the β range of interest in a different number of intervals that do not cross the phase transition should lead to the same physical results, but n int has some effect on the numerical efficiency of the simulation algorithm. In particular, smaller values of n int (i.e. broader intervals in β) typically require larger values of N and more statistics. On the other hand, larger n int implies a larger overhead for thermalization of the initial configurations at the start of each transformation (in this work we used 5000 full thermalization sweeps at T = 0 and 15000 at finite temperature).
We run our simulations on lattices with N t = 6, 7, 8 and 10 and for N s > 12N t (and typically N s 16N t ), according to the parameters listed in table 1, where n traj = 10 throughout, and n conf denotes the total number of configurations used for each combination of parameters, given by the sum of the N · n traj products over all n int intervals. These calculations were carried out on the A1 Intel Broadwell partition of the MARCONI tier-0 supercomputer of the Italian CINECA consortium, a Lenovo system. The total number of core-hours to produce the numerical results presented in this work was approximately 9 × 10 5 .  The pressure p is the primary thermodynamic observable that we compute using Jarzynski's equality, according to eq. (28): the results at the different values of N t are shown in fig. 1.
From the results for p/T 4 at finite lattice spacing, we take the continuum limit by first interpolating them, for each N t , through cubic splines, and then by fitting the splines at fixed values of T with a constant-plus-linear-term fit in 1/N 2 t : This defines α(T ) as the continuum-extrapolated value of the pressure at that temperature. Different types of interpolations at fixed N t , or more complicated functional forms than the one in eq. (30), yield compatible results. As the starting value for p/T 4 at T in = 0.7T c , we use p(T in )/T 4 in = 0.00086, the analytical result for a glueball gas [42] (for a thorough discussion, see also refs. [43] and references therein). Our results for p/T 4 obtained in this way are shown in figure 2, in comparison with those from refs. [12,17]. Other basic thermodynamic observables, like the trace of the energy-momentum tensor ∆, the energy per unit volume , and the entropy per unit volume s are directly related to the pressure by basic thermodynamic relations: To compute the trace of the energy-momentum tensor, we first fit our continuum values for   [17] (red squares) and with those obtained using the moving-frame method in ref. [12] (blue triangles). The results in the confining phase are displayed in the inset plot.
p/T 4 in the temperature range T c ≤ T ≤ 2.5T c , to the following rational function of w = ln(T /T c ): The fit gives p 1 = 0.0045(35), p 2 = 1.76 (12), p 3 = 10.6(2.2), p 4 = 2.07 (47), and p 5 = 5.8(1.1), with a reduced χ 2 equal to 0.33. Deriving the function on the right-hand side of eq. (34), we obtain the results for the trace of the energy-momentum tensor shown in fig. 3, where we compare them with those that have been recently obtained by different groups, using other methods [9,12,17].   [17] (red squares), with those obtained using the moving-frame method in ref. [12] (blue triangles), and with those computed using the gradient-flow method in ref. [9] (orange diamonds).
Finally, the energy density and the entropy density are simply obtained using eq. (32) and eq. (33), respectively: the results are shown in figs. 4 and 5.
The complete set of our continuum-extrapolated results for p/T 4 , ∆/T 4 , /T 4 , and s/T 3 is reported in table 2.

Discussion and concluding remarks
The results presented in section 3 deserve several relevant comments, which are separately discussed in each of the following subsections.

Universality of lattice results
First and foremost, the comparison of our data, obtained with an algorithm based on Jarzynski's equality, with those from previous works [9,12,17] provides a striking check of the expected universality of lattice results: the fact that the high-precision results obtained by four independent groups, using remarkably different computational strategies, are essentially compatible with each  other, indicates that all sources of systematic or statistical uncertainties are under control, and confirms that lattice calculations provide solid, first-principle results for the thermodynamics of strong interactions in the temperature range probed by heavy-ion collision experiments.
Looking at the fine details, however, one can also see that some slight tension between the results obtained with different methods still persists. For example, the results for the various thermodynamic observables reported in ref. [17] appear to be systematically lower than the others. This effect is most visible for the trace of the energy momentum tensor in figure 3, while it is essentially absent in the results for the pressure shown in figure 2 (whereas the energy and entropy densities, being obtained from linear combinations of p and ∆, exhibit an intermediate behavior, with milder tensions). Also, the discrepancy appears to be largest in the temperature region of  (23)   the peak in ∆/T 4 , where it is comparable (in sign and magnitude) with the difference from the results from ref. [14] that was reported in ref. [17] itself. While the origin of this slight difference between the results of ref. [17] and the others remains unclear, 6 it should be remarked that it is quantitatively modest, and does not change the overall physical picture of the SU(3) equation of state in a significant way.

Physical implications of the results
In terms of physics, these results confirm that, in the temperature range relevant for collider experiments, the thermodynamics of SU(3) Yang-Mills theory is dominated by non-perturbative effects, and far from the ideal limit of a gas of free gluons. In particular, the equilibrium observables considered here are significantly different from their Stefan-Boltzmann values: which are reached only in the T → ∞ limit, and approached logarithmically slowly as the temperature is increased. A way to study the values for these quantities at high, but finite, temperatures, is by means of thermal perturbation theory. Weak-coupling expansions for the pressure of QCD (and pure-glue Yang-Mills theory) have a long history: the leading-order correction, O(g 2 ), was worked out forty years ago [45]. Soon thereafter, however, it was realized that perturbative expansions in thermal non-Abelian gauge theories have non-trivial features: in particular, the existence of infrared divergences, which have to be resummed, leads to the appearance of terms proportional to odd powers and/or logarithms of g, and, most importantly, implies that, at some finite order, an infinite number of Feynman diagrams, of arbitrarily complicated topologies, will contribute [33]. This "Linde problem" leads to the peculiar situation, in which the number of sensible perturbative orders is finite. For the pressure, this problem occurs at O(g 6 ), or four loops, and the program of computing all perturbative contributions up to that order has been completed, with the determination of all terms O(g 3 ) [46], O(g 4 ln g) [47], O(g 4 ) [48], O(g 5 ) [49], and finally O(g 6 ln g) [32], but the convergence of the perturbative series is known to be very slow [31]. In particular, truncating the perturbative series at subsequent orders results in a strongly oscillating behavior in the temperature range probed in heavy-ion collisions. As we already mentioned above, dimensional reduction provides an elegant way to systematically account for the non-perturbative physics related to infrared divergences, by means of effective theories [29] that can be studied non-perturbatively on the lattice [50] (an approach that has recently found useful applications even for real-time phenomena in hot QCD [51]).
The limited convergence of weak-coupling expansions for thermodynamic quantities in finitetemperature QCD is due to the fact that characteristic phenomena of plasmas, such as screening and Landau damping, must be properly accounted for. To this purpose, one can re-arrange the perturbative expansions using a hard-thermal-loop approach [52] (in which the Debye mass m D in the "improvement term" added to the Lagrangian is, in principle, arbitrary, and must be fixed in a self-consistent way).
In any case, the intrinsically non-perturbative nature of the physics of high-temperature non-Abelian gauge theories makes it hardly surprising that leading-order weak-coupling expansions provide an unsatisfactory description for the equation of state of strong interactions, even at high temperatures. While various phenomenological models (including bottom-up models based on the gauge-gravity duality [53]) describe well the thermodynamics of the quark-gluon plasma at temperatures close to deconfinement [54], lattice calculations remain the most reliable first-principle theoretical tool to study thermal QCD under the conditions probed in heavy-ion collisions.

Computational efficiency of the algorithm
The algorithmic strategy proposed in ref. [24] and based on Jarzynski's equality [21,22] provides a robust and efficient tool to compute the equation of state non-perturbatively on the lattice. As we mentioned above, its implementation in Monte Carlo calculations only requires that the Markov process satisfies detailed balance, and the assumption that the initial starting configurations (not those at subsequent Monte Carlo times) are thermalized. It is also interesting to observe that, as we pointed out in section 3, in our computations we only used n traj = 10 trajectories for each of the N t and N s combinations of values and each of the n int temperature intervals in the finite-T simulations (and for the corresponding ones at T = 0). This means that, out of the total number of configurations from which we could extract data for each combination of parameters (which is denoted as n conf in table 1, and equals the sum of the N · n traj products over the n int intervals), only a very small number n int · n traj required thermalization.
It is important to discuss the main factors determining the computational efficiency of the algorithm. The key aspect of our algorithm is the exponential average appearing in eq. (3): this implies that, if the typical amplitude of fluctuations in W/T from one trajectory to another is large, then the quantity appearing on the left-hand side of eq. (3) receives its dominant contributions from trajectories in the tail of the distribution of values of W/T , and its accurate estimate by Monte Carlo methods would require a prohibitively large number of trajectories. In the present context, W/T is replaced by the total variation in Euclidean action S along a trajectory, see eq. (29). Since S (and, as a consequence, ∆S) is an extensive quantity, one may expect it to be practically impossible to obtain accurate results on large lattices: in particular, the typical fluctuations in the exponent of eq. (3) will scale like the square root of the lattice hypervolume, making the evaluation of p(T ) nearly unattainable on all lattices, except for very coarse ones. Note that this is the same argument by which the sign problem affecting lattice QCD calculations at finite quark chemical potential µ [55] cannot be solved by the reweighting method [56]. In fact, the reweighting method is a special case of our algorithm, which reduces to it for N = 1. Nevertheless, with our algorithm it is possible to take the fluctuations in ∆S under control even on a lattice of arbitrarily large hypervolume V, for example simply by scaling N proportionally to V. This is so, because the fluctuations in ∆S from one trajectory to the other result from the sum of the fluctuations in the summands on the right-hand side of eq. (29): assuming that the latter are uncorrelated with each other, when N grows (at fixed V), the fluctuations in ∆S will be suppressed like 1/ √ N (so that, in particular, in the "quasi-static limit" N → ∞ the field configurations remain in equilibrium throughout their evolution from the initial to the final ensemble, and ∆S is exactly equal to the logarithm of Z in /Z fin on all trajectories), thus by making N scale with V, the two, opposite effects on the size of the fluctuations in ∆S can compensate each other.
A convenient practical way to test whether the numerical results obtained using our algorithm for finite statistics are biased by poor sampling of the distribution of ∆S values, consists in running the simulation in the direct (from λ in to λ fin ) and in the reverse (λ fin → λ in ) direction. If N is too small, then the fluctuations in ∆S from one trajectory to the other can be very large, and a numerical estimate of the average on the left-hand side of eq. (3) will be determined by a small number of configurations in one of the tails of the distribution, which is very difficult to sample in an accurate way. This will then induce a systematic bias in the numerical results. By carrying out the computation in the reverse direction, the same effect will occur for the variation in Euclidean action induced by a λ fin → λ in transformation, but this time for a different distribution, resulting in a generally different bias of numerical results. Thus, an inconsistency in the numerical results obtained from simulations starting from λ in or from λ fin provides a useful detector of poor-sampling effects.
Note that the mutual consistency of results obtained from a calculation in the direct and in the reverse direction is not a sufficient but a necessary condition for the correctness of the result. All results in our present work pass this test.
More in general, a systematized study of statistical and systematic uncertainties, with the goal of algorithm optimization, can rest on the mathematical theory that has been developed over several years, for generic Monte Carlo calculations using Jarzynski's equality in statistical mechanics. The "good practices" underlying simulations with non-equilibrium work methods are by now well-established, and are encoded in formulas related to the deep connections between statistical mechanics and information theory [57]. For a detailed discussion of the computational efficiency of algorithms based on Jarzynski's equality, see refs. [58,59].
It is interesting to compare the numerical efficiency of our algorithm with other computational methods, that have been used in the literature to evaluate the QCD equation of state on the lattice. Among the three recent works that we directly compared our results with [9,12,17], the one reported in ref. [17] is based on the most similar method, i.e. the integral method [8]. Like for the integral method, our determination of the equation of state is based on the p = −f identity, and requires the numerical subtraction of contributions from quantum, non-thermal fluctuations that would make the free-energy density divergent in the continuum limit a → 0. Exactly like for the integral method, this ultraviolet divergence can be removed by subtracting the free-energy density evaluated at T = 0 and at the same lattice spacing: see the subtrahend in the brackets on the right-hand side of eq. (28). Thus, our method, in itself, does not allow one to bypass the need of renormalization arising in the integral method [8], as it relies on the same vacuumcontribution subtraction. However, an important difference between our method and the integral method is that, while in the latter all field configurations produced at intermediate temperatures More quantitatively, as we mentioned above, the thermalization that in this work was used for the configurations at the initial β values consisted of n therm = 1.5 × 10 4 sweeps (where, as we mentioned above, by "sweep" we mean the combination of one heat-bath [34] and five to ten overrelaxation updates [35] on all link variables of the lattice) for the lattices at finite temperature, and of n therm = 5 × 10 3 sweeps for those at T = 0. Naïvely, if one were to make a comparison with a computation of the equation of state based on the integral method [8] using the same number of configurations for each data set (the parameter n conf reported in table 1), the fact that in our calculation the intermediate configurations need not to be thermalized, would imply a very large reduction in CPU time, by a factor of the order of N , the number of steps one trajectory consists of. This estimate of the computational-cost reduction, however, neglects the inherently different nature of the field configurations that are used by the two algorithms. The point is that, while the integral algorithm only uses thermalized configurations, and extracts information on the thermodynamic equilibrium ensemble they belong to, a computation based on Jarzynski's equality attempts to extract information from configurations that are not typical ones of the "target" equilibrium ensemble (the one specified by the partition function Z fin ): in fact, most of them are not typical configuration of any equilibrium ensemble, since, by definition of the algorithm, they are not required to thermalize. In practice, the algorithm "tries to sample" the target equilibrium ensemble by progressively driving the thermalized configurations of the initial ensemble towards the target ensemble. As remarked above, if the fluctuations in ∆S are too large, then such sampling becomes computationally very demanding (like in the reweighting method) and exponentially increasing statistics is required for a given level of precision: this is a general feature of all Monte Carlo algorithms based on Jarzynski's equality, which was shown and discussed in mathematical detail in refs. [58] and, more recently, in refs. [59], and we refer the interested readers to those references. Note that large fluctuations in ∆S may occur when N is small, when β fin − β in is large, when for β = β in and for β = β fin the system is in two different phases, 7 or when the number of degrees of freedom is large (including, in particular, when the volume is large); the fact that the fluctuations in ∆S become large when β fin − β in is large (or, more precisely, when the equilibrium statistical ensembles respectively corresponding to β = β in and to β = β fin have little overlap) implies that a proper sampling of such "long" trajectories requires higher statistics. Conversely, if the number of steps in each trajectory is increased to large values (with the initial and final parameters fixed), then the simulation proceeds through a sequence of steps which are "only slightly" out of equilibrium, and for infinite N the simulation goes through a sequence of configurations in thermal equilibrium.
In order to further clarify the meaning of the non-equilibrium transformations used in simulations based on Jarzynski's theorem, it is instructive to look at examples of the distributions for ∆S, the total Euclidean-action variation during a non-equilibrium trajectory, defined by eq. (29), that can be obtained in simulations starting from the same initial ensemble (at equilibrium), aimed at the same target ensemble, and with a similar computational cost, but with different values of N . To this purpose, in fig. 6 we show the density of probability p of observing a variation of Euclidean action ∆S, defined by eq. (29), as obtained from two different simulations on a finite-T lattice with N t = 6 and N s = 96. More precisely, the histograms display the probability distribution in terms of "left-stairs" columns, associated with bins of width 1.25, whose total area is normalized to one.
For both calculations, β in = 6.17921 and β fin = 6.13671, and also the number of configurations used and the total CPU time that was needed to produce them are comparable, but for one of them (whose results are denoted by red histograms) the (β fin − β in ) interval was split into 425 intervals, with ∆β = −10 −4 , while for the other one (represented by the green histograms) ∆β was ten times smaller, and N was equal to 4250.
The fact that, in the latter case, ∆β is much closer to zero implies that the simulation proceeds through a sequence of configurations which are driven out of equilibrium very slowly. As a consequence, one expects the observed distribution of ∆S values to be close to a very narrow, Gaußian-like distribution centered around the free-energy difference (in units of T ) between the two equilibrium ensembles corresponding to β = 6.17921 and β = 6.13671, that one could compute by standard Monte Carlo calculations at equilibrium on this finite lattice. Indeed, the green histogram plotted in fig. 6 does confirm this expectation: the distribution is sharply peaked around a value of ∆S close to 825740-a value which, unsurprisingly, is fully compatible with the value of ∆F/T extracted from this simulation using our algorithm based on Jarzynski's theorem: ∆F/T = 825740.3 ± 0.5. In fact, it is trivial to observe that, when the probability density of ∆S values tends to a very sharply peaked, nearly δ-like, distribution, then the value of ∆F/T obtained from eq. (3) (in which, as we stated above, ∆S plays the rôle of W/T ) coincides with the value of ∆S at which the peak is located. 8 Much less trivial, however, is the fact that exactly the same result is obtained (within statistical uncertainties) when ∆F/T is calculated by Jarzynski's theorem through the former sample of trajectories, i.e. those obtained with N = 425 and a significantly larger ∆β = −10 −4 . In this case, β is let interpolate from β in to β fin at a rate that is ten times faster than in the previous case: as a consequence, the configurations generated by the Monte Carlo along each trajectory are driven out of equilibrium much more briskly, and, in general, the ∆S values computed in each non-equilibrium trajectory will fluctuate more wildly (and, in general, in a non-universal, and not trivially predictable, way). Once again, this is clearly visible in our data: the red histogram shows that in this case the distribution of ∆S values is quite broad, and appears to have a nontrivial structure (even featuring secondary peaks, etc.). From the plot, one also notes that this distribution takes its largest values in the (approximate, and poorly defined) range of ∆S between 825753 and 825761. Remarkably, however, the result for ∆F/T obtained using Jarzynski's theorem with this set of trajectories is ∆F/T = 825741.5 ± 4.1, which is very far from the interval where this p(∆S) is largest, and perfectly compatible with the one obtained from the set of trajectories with N = 4250, that are much closer to equilibrium! It is also worth noting that this result for ∆F/T has very high precision, of the order of a few per million, comparable with the one achieved in simulations near equilibrium, even though it arises from the exponential average of a quantity (the action variation during non-equilibrium trajectories) whose distribution is so broad. Once again, we remark that, while the details of such distribution may be affected by non-universal dynamics of the Monte Carlo, with a sizable impact on results obtained from limited statistics, the determination of ∆F/T through Jarzynski's theorem becomes exact when the algorithm samples p(∆S) to a sufficient level of precision: the equality encoded in eq. (3) allows one to extract equilibrium information from ensembles of configurations out of equilibrium.
For completeness, in fig. 7 we also plot the results obtained from two analogous simulations, carried out in the opposite direction, i.e. starting from thermalized configurations at β = 6.13671, and progressively driving the system out of equilibrium, through a sequence of configuration updates in which β is increased to β = 6.17921: like in the previous case, the distribution of ∆S values obtained at smaller N is the broader and farther from equilibrium one, but the final results for ∆F/T , which in this case are ∆F/T = −825734.7 ± 2.6 for the simulation with N = 425, and ∆F/T = −825740.3 ± 1.1 for the one with N = 4250, are compatible with each other, and with (minus) the results for ∆F/T obtained in the simulations with β in = 6.17921 and β fin = 6.13671, discussed above.
One may also compare our algorithm to compute the pressure, with a variant of the standard integral method combined with the "snake algorithm" defined in ref. [60], whereby a ratio of partition functions of the form Z λ fin /Z λ in is factorized into a telescoping product of the form (with Z 0 = Z λ in and Z N = Z λ fin ), where each (Z n , Z n+1 ) pair appearing in the intermediate ratios describes statistical ensembles with better overlap than Z λ fin /Z λ in . 9 Note that, for later convenience, we assumed the product on the right-hand side of eq. (36) to involve exactly the same number of terms as the sum on the right-hand side of eq. (29), i.e. N . The fundamental idea underlying the snake algorithm is that, when the distributions of configurations associated with Z λ in and Z λ fin are poorly overlapping, so that a Monte Carlo estimate of the Z λ fin /Z λ in ratio to a fixed level of relative precision would require exponentially large statistics, evaluating each of the Z n+1 /Z n ratios appearing on the right-hand side of eq. (36) is computationally much cheaper, provided Z n and Z n+1 always describe ensembles with a good overlap with each other. Then, all Z n+1 /Z n ratios are O(1), and can be evaluated to high precision with a fixed computational cost. The final statistical uncertainty on Z λ fin /Z λ in is eventually obtained by the sum (in quadrature, as they are obtained from independent simulations) of the uncertainties on the Z n+1 /Z n ratios, and does not grow exponentially. If Z λ fin /Z λ in is factorized in exactly N terms (as we assumed in the equation above), and if each of the Z n+1 /Z n ratios is calculated by Monte Carlo methods with n traj configurations, then with the snake algorithm one would be able to determine Z λ fin /Z λ in by producing N · n traj thermalized and uncorrelated configurations. An elementary argument shows that, while with this number of independent configurations the algorithm based on Jarzynski's equality would yield an estimate of Z λ in /Z λ fin from n traj measurements, the snake algorithm would instead express the same quantity as (where (Z n+1 /Z n ) kn denotes the value of the Z n+1 /Z n ratio computed in the k n -th configuration), which, when the product is expanded, corresponds to n traj N measurements. While this naïve counting argument overlooks the rôle of fluctuations, it suggests that, under these conditions (i.e. working with a sample of N · n traj , completely thermalized and fully independent configurations), the snake algorithm would outperform the one based on Jarzynski's equality. This is not surprising: indeed, were the Markov update algorithm perfectly efficient, i.e. capable of generating fully thermalized and decorrelated configurations in a single sweep, then the system would never be out of equilibrium. In that case, our algorithm would not be affected by any overlap problem even if the λ in → λ fin transformation were carried out in a single step: this means that our algorithm could be reduced to the reweighting algorithm, as discussed above, and, with a sample of N · n traj configurations, one could factor the Z fin /Z in ratio into a product of N intermediate ratios, each of which could be computed by reweighting. Clearly, however, it is in the more realistic case of Markov updates that do not produce immediate thermalization, that our simulation algorithm reveals its full potential: in this case, our algorithm turns the fact that the field configurations "lag behind" equilibrium into an advantage, by means of Jarzynski's equality-whereas a computation based on the standard integral method, or on its snake-algorithm variant, would always require thermalized configurations, which would increase its computational cost.
As concerns the two other works that we confronted our results with [9,12], a comparison of the computational costs is far less direct.
In the calculation of the equation of state presented in ref. [9], based on the Wilson flow [61], the energy density and the pressure are extracted from the diagonal components of the renormalized energy-momentum tensor of the theory, which, in turn, is obtained from the behavior of the two dimension-four, gauge-invariant operators defined in terms of the flowed field-strength tensor at short flow time [62]. The calculation involves the numerical solution of the differential equation defining the flowed gauge field, and a double extrapolation, in which the small-flow-time limit has to be taken after the continuum-limit (a → 0) extrapolation.
Finally, in ref. [12], the primary observable to determine the equation of state is the entropy density (in units of T 3 ), which is directly related to the time-space off-diagonal components of the energy-momentum tensor. When the theory is defined in a relativistic moving frame, it is possible to prove a set of Ward-Takahashi identities for the correlators of the energy-momentum tensor, that relate the energy and momentum distributions in the canonical ensemble, and allow one to non-perturbatively renormalize the energy-momentum tensor [63]. In practice, this multiplicative renormalization of the energy-momentum tensor is encoded in a finite function of the bare coupling, which has to be determined independently.

Further applications of the algorithm based on Jarzynski's equality
Extending our algorithm to calculations including dynamical quark flavors is straightforward, and we plan to implement it in code for lattice simulations of full QCD in future work. In this respect, it would be interesting to compare the efficiency of this algorithm to different calculations of the QCD equation of state [7,13,64].
Another direction, in which the present work can be generalized, consists in applying the Jarzynski's equality to lattice calculations of different physical observables. The computational strategy based on this algorithm, indeed, is quite general and versatile, and not restricted to the thermodynamics domain. As a benchmark study, a determination of the interface free energy was presented in ref. [24]; the extension to other quantities, like the running coupling of the strong interaction in the Schrödinger-functional scheme [65] and the entanglement entropy in lattice gauge theory [66], is under way. N t = 6, N s = 96, β in = 6.17921, β fin = 6.13671 Figure 6: Histograms of the probability density p(∆S) for a variation ∆S in Euclidean action during non-equilibrium trajectories, as defined in eq. (29), in two different calculations at finite temperature, on a lattice with N t = 6 and N s = 96. In both simulations, the starting configuration in each of the trajectories is drawn from an equilibrated distribution at β = β in = 6.17921, and is let evolve through a sequence of non-equilibrium transformations, during which the β parameter is decreased down to β fin = 6.13671. In the first simulation (red histogram), the non-equilibrium transformations were carried out by dividing the (β fin − β in ) interval into N = 425 sub-intervals of amplitude |∆β| = −10 −4 and 300 trajectories were produced, whereas in the second simulation (green histogram), the (β fin − β in ) interval was divided into N = 4250 sub-intervals of amplitude |∆β| = −10 −5 , and the number of trajectories was n traj = 40. The estimates for the free-energy difference (in units of the temperature) obtained using the algorithm based on Jarzynski's equality from these two simulations are ∆F/T = 825741.5 ± 4.1 for the calculation with N = 425, and ∆F/T = 825740.3 ± 0.5 for the one with N = 4250.