Variational study of mass generation and deconfinement in Yang-Mills theory

A very simple variational approach to pure SU($N$) Yang-Mills theory is proposed, based on the Gaussian effective potential in a linear covariant gauge. The method provides an analytical variational argument for mass generation. The method can be improved order by order by a perturbative massive expansion around the optimal trial vacuum. At finite temperature, a weak first-order transition is found (at $T_c\approx 250$ MeV for $N=3$) where the mass scale drops discontinuously. Above the transition the optimal mass increases linearly as expected for deconfined bosons. The equation of state is found in good agreement with the lattice data.


I. INTRODUCTION
In the last decades the dynamics of QCD has been under intensive theoretical study, aimed at understanding the properties of matter under the extreme conditions reached by heavy-ion collisions. Our understanding of the phase diagram has further motivated the study of pure SU(N ) Yang-Mills theory in the IR and at finite temperature, neglecting quarks as a first approximation. However, despite the important progresses made, we still miss an analytical description of SU(N ) theory from first principles, because of the breaking down of standard perturbation theory below the QCD scale.
The numerical simulation of the theory on a lattice has provided many important insights into the gluon dynamics. Among them, the dynamical generation of a gluon mass in the dressed propagator in the Landau gauge [1][2][3][4][5][6][7][8] and the occurrence of a phase transition with the gluons that become deconfined above the critical temperature [9][10][11]. However, since the numerical simulations can only provide data in the Euclidean space, no direct information can be gained in the Minkowski space where the dynamical properties of the gluon are defined. For instance, no direct proof of confinement can be obtained on the lattice and even the definition of mass can only be regarded as an energy scale without any clear dynamical meaning.
On the other hand, effective models have been studied analytically, but they are not from first principles and are usually based on some modified quantization procedure [26][27][28][29] or different Lagrangians. For instance, adding a gluon mass to the Lagrangian is enough for extending the validity of perturbation theory down to the deep IR, yielding a very good overall picture of Yang-Mills theory at one loop [30][31][32]. In the context of background field methods the added gluon mass has pro-vided a good description of the phase diagram at finite temperature, enforcing the idea that most of the nonperturbative effects can be embedded in the gluon-mass parameter [33][34][35][36]. While those models are important for understanding the physics of gluons, there is a growing interest in the study of analytical approaches to the exact SU(N ) theory.
In this paper, we discuss a very simple variational approach to SU(N ) theory, based on the Gaussian effective potential (GEP) in a linear covariant gauge. We do not modify the original Lagrangian of the theory but optimize the perturbative expansion by a variational argument, yielding a calculational analytical method that already provides very important predictions at the lowest orders of the approximation. Among the main results achieved by the present study we mention: i) a variational argument for mass generation; ii) the prediction of a first-order deconfinement transition at T c ≈ 250 MeV for N = 3; iii) the formal definition of a perturbative expansion around the optimized vacuum, allowing for an order-by-order improvement of the approximation.
The original approach of Ref. [37] is here improved and extended to finite temperature, yielding analytical results up to a one-dimensional numerical integration that is required for the thermal functions. The perturbative expansion around the vacuum turns out to be the massive expansion developed in Refs. [38][39][40][41] which was found in excellent agreement with the lattice data [42]. Thus, the present study enforces the validity of that expansion and provides a variational argument for its derivation. Moreover, while by itself the massive expansion cannot give a genuine proof of mass generation, the variational nature of the GEP can be used as a tool for demonstrating that a massless gaussian vacuum of Yang-Mills theory is unstable against the vacuum of massive gluons [37].
The expansion has been extended to finite temperature in Ref. [41] allowing for a direct calculation of the gluon damping rate in the IR and providing a direct proof of confinement. While in that study the zeroth order mass parameter was kept fixed, at finite temperature the GEP provides the free energy and allows us to determine the trial mass parameter variationally, as a function of tem-perature. The optimal mass scale is found discontinuous at the deconfinement transition, leading to an enhancement of the mass decrease that was already found in Ref. [41], in agreement with the observed behavior of the Debye mass in lattice simulations [10].
The GEP is the energy density of a trial Gaussian vacuum functional that is centered at a given average value of the field. The width of the functional is given by the mass of the trial free theory and is determined variationally at each value of the average field, yielding an effective potential that has been studied by several authors, mainly in the context of spontaneous symmetry breaking and scalar theories . While the GEP is a genuine variational method [46,47], several extensions to higher orders have been proposed [56][57][58][59]. The idea of an expansion around the optimized vacuum of the GEP is not new [65] but has not been developed further. Expanding around the optimized massive vacuum of the GEP, the unconventional massive expansion of Refs. [38][39][40] is recovered in a natural way [37]. Thus, the phenomenological success of the expansion might be due to the variational choice of a zeroth order vacuum which incorporates most of the non-perturbative effects, leaving a residual interaction term that can be treated by perturbation theory.
One of the important merits of the GEP is its paradox of being a pure variational method disguised as a perturbative calculation, making use of the standard graphs of perturbation theory. Moreover, in the present context, the calculation is highly simplified by the assumption that the average of the gauge field is zero at the minimum of the potential. In other words, we only need the effective potential at its minimum where it is a function V (m) of the trial mass parameter m. However, at variance with perturbation theory, the issue of renormalization is less standardized in a variational method and the regularization of the diverging integrals becomes a central aspect of the calculation.
The paper is organized as follows: in Sec.II the general formalism is discussed in the simple case of a scalar theory where standard well known results are recovered by the method; in Sec.III the delicate issue of regularization of the diverging integrals and renormalization of the GEP is addressed; in Sec.IV the GEP for pure SU(N ) Yang-Mills theory is studied at T = 0, providing a simple variational argument for mass generation; in Sec.V the GEP is extended to finite temperature and the phase transition is discussed; a general discussion and a summary of the results follow in Sec.VI.

II. GEP AND MASS GENERATION IN THE SCALAR THEORY
In order to illustrate the method, in this section we revise the formalism for the simple case of a self-interacting scalar theory [46] where the effective potential is well known and is given by three vacuum graphs as shown in Fig. 1. The renormalization scheme will be discussed in the next section. Most of the arguments developed here are quite general and will be used in the rest of the paper.
Let us consider the Lagrangian where m B is a bare mass. We can split the total Lagrangian as L = L 0 + L int where the trial quadratic part is and describes a free scalar particle with a trial mass m = m B . The new interaction follows as so that the total Lagrangian is left unchanged. If we neglect the interaction, then a free Hamiltonian H 0 is derived from L 0 and its ground state |m satisfies and depends on the trial mass m. Restoring the interaction L int , the full Hamiltonian reads H = H 0 + H int and by standard perturbation theory, the first-order energy of the ground state reads and is equivalent to the first-order effective potential V 1 (m) evaluated by perturbation theory in the covariant formalism with the interaction L int . Thus, the stationary condition gives the best value of m that minimizes the vacuum energy of the ground state |m . While being a pure variational method, the first-order effective potential V 1 (m) = E 1 (m) can be evaluated by the sum of all the vacuum graphs up to first order (the three loop graphs in Fig. 1). The resulting optimized effective potential is the GEP. Usually, the effective potential is evaluated for any value of the average ϕ = φ and the best m also depends on that average. If the SU(N) Figure 1: Vacuum graphs contributing to the GEP for the scalar theory (first row) and pure SU(N ) Yang-Mills theory (second row).
symmetry is not broken, then the minimum of the effective potential is at ϕ = 0 where V 1 (m) is a function of the trial mass, to be fixed by the stationary condition Eq. (6). We assume that the gauge symmetry is not broken in Yang-Mills theory so that V 1 (m) at ϕ = 0 is the effective potential we are interested in. The variational nature of the method ensures that the true vacuum energy is smaller than the minimum of V 1 (m). At the minimum, |m provides an approximation for the vacuum and is given by the vacuum of a free massive scalar particle with mass equal to the optimized mass parameter m = m B . Of course, the optimal state |m is just a first approximation and the actual vacuum is much richer. However, we expect that a perturbative expansion around that approximate vacuum would be the best choice for the Lagrangian L, prompting towards an expansion with an interaction L int and a free part L 0 that depend on m and can be optimized by a clever choice of the parameter m. Different strategies have been proposed for the optimization, ranging from the stationary condition of the GEP, Eq.(6), to Stevenson's principle of minimal sensitivity [66]. A method based on the minimal variance has been recently proposed for QCD and other gauge theories [57,[67][68][69][70][71]. In all those approaches, the underlying idea is that an optimal choice of m could minimize the effect of higher orders in the expansion. Since the total Lagrangian does not depend on m, the physical observables are expected to be stationary at the optimal m, thus suggesting the use of stationary conditions for determining the free parameter. As a matter of fact, if all graphs were summed up exactly, then the dependence on m would cancel in the final result, so that the strength of that dependence measures the weight of the neglected graphs at any order.
Leaving aside the problem of the best choice of m, we observe that at ϕ = 0 the calculation of the firstorder effective potential V 1 (m) is quite straightforward and follows from the first-order expansion of the effective action Γ(ϕ) where the functional integral is the sum of all one-particle irreducible (1PI) graphs and S = S 0 + S int is the action. The effective potential then follows as V (m) = −Γ(0)/V 4 where V 4 is a total space-time volume. The sum of graphs up to first order gives the first-order effective potential V 1 (m) which is the GEP when optimized by Eq.(6). At finite temperature, the effective potential is replaced by a density of free energy F(T, m) according to where the action S = S 0 + S int is the integral over imaginary time τ β = 1/T and V 3 is a total three-dimensional space volume. The perturbative expansion of the free energy follows by the same connected graphs contributing to the effective potential, with loop integrals replaced by a sum over discrete frequencies and a three-dimensional integration. In the limit T → 0 the effective potential is recovered as V (m) = F(0, m) and each thermal graph gives the corresponding vacuum term. Because of the one to one correspondence of the graphs we can easily switch from the thermal to the vacuum formalism when required. Moreover, at finite temperature, the GEP maintains its genuine variational nature. In the Hamiltonian formalism, the variational argument that follows Eq.(5) can be generalized by Bogolubov's inequality while in the Lagrangian formalism the same result is found by Jensen-Feynman inequality where F 0 is the free energy obtained by the trial Lagrangian L 0 while F 1 is the first order approximation which becomes the GEP when optimized. The two inequalities tell us that the expansion must be truncated at first order for a genuine variational approximation. Here and in the next two sections, when not specified, we will deal with the effective potential and with the renormalization of the vacuum graphs at zero temperature. The thermal corrections are finite and do not require any further renormalization.
Since we are interested in the massless Yang-Mills theory, we set m B = 0 in the interaction Eq.(3) and study a massless scalar theory as a toy model for the problem of mass generation. The vertices of the theory can be read from L int in Eq.(3) where we set m B = 0 and are used in Fig.1 in the vacuum graphs. The usual four-point vertex −λ is accompanied by the counterterm δΓ = m 2 that is denoted by a cross in the graphs. This counterterm must be regarded as part of the interaction so that the expansion is not loopwise and we find one-loop and twoloop graphs summed together in the first-order effective potential. That is where the non-perturbative nature of the method emerges since the expansion in not in powers of λ but of the whole interaction L int . The zeroth order (massive) propagator ∆ m follows from L 0 and is shown as a straight line in the vacuum graphs. The tree term is the classical potential and vanishes in the limit ϕ → 0. The first one-loop graph in Fig.1 gives the standard one-loop effective potential, containing some effects of quantum fluctuations. It must be added to the second one-loop graph in Fig.1, the crossed graph containing one insertion of the counterterm.
It is instructive to see that the exact sum of all oneloop graphs with n insertions of the counterterm gives the standard vacuum energy of a massless particle. In other words, if we sum all the crossed one-loop graphs the dependence on m disappears and we are left with the standard one-loop effective potential of Coleman and Weinberg [72] 1L is the standard one-loop effective action at ϕ = 0 and ∆ −1 0 = p 2 is the free-particle propagator of a massless scalar particle. Up to an additive constant, not depending on m, Eq.(13) can be written as then expanding the log we obtain a massive expansion that is shown pictorially in Fig.2 as a sum of crossed oneloop vacuum graphs. While the sum cannot depend on m, if we truncate the expansion at any finite order we obtain a function of the mass parameter. As a test of consistency, one can easily check that, once renormalized as described below, the sum of all the crossed one-loop vacuum graphs in Fig.2 gives zero exactly. The calculation of the GEP requires the sum of only the first two terms of Eq.(15), the two one-loop graphs in Fig.1. We cannot add higher-order terms without spoiling the variational method since the average value of the Hamiltonian in the trial state |m is E 1 (m) = V 1 (m), according to Eq.(5). Using the identity the sum of one-loop graphs in Fig.1 can be written as where K(m) and J(m) are defined as and because of Eq. (16), satisfy the identity At T = 0 they can be written as explicit diverging integrals to be regularized in some renormalization scheme. At finite temperature Eq.(19) still holds, but the integrals acquire a finite additive thermal part.
We recognize K(m) as the standard one-loop effective potential of Coleman and Weinberg for a massive scalar particle in the limit ϕ → 0. This term contains the quantum fluctuations at one-loop. The second term in Eq.(17) is a correction coming from the counterterm and arises because the exact Lagrangian was massless.
The calculation of the GEP also requires the two-loop graph in Fig.1 which is first-order in λ. It can be recovered from the crossed one-loop graph by just substituting the vertex −m 2 with the seagull one-loop self energy Σ 1L that reads [57] and adding a 1/2 symmetry factor. The resulting twoloop term is The GEP follows as the sum At this stage we have just recovered the GEP in the limit ϕ → 0 and Eq.(23) agrees with the well known GEP in that limit [46,56,57,59,60]. More precisely, V G is the GEP when m is optimized by the stationary condition Eq.(6) that reads yielding the usual gap equation of the GEP From a mere formal point of view, if Eq.(25) has a nonzero solution, the GEP predicts the existence of a mass for the massless scalar theory. That is of special interest because for m B = 0 the Lagrangian in Eq.(1) has no energy scale, just like Yang-Mills theory and QCD in the chiral limit. Thus, it can be regarded as a toy model for the more general problem of mass generation and chiral symmetry breaking.

III. RENORMALIZATION OF THE GEP
The scalar theory has been studied by many authors in the past, using different regulators, ranging from the insertion of a cut-off to dimensional regularization and, of course, to lattice regularization. The resulting physical theories are not always equivalent and the problem of triviality is still not totally solved. The issue is quite subtle and has to do with the physical meaning that we give to the theory in a four dimensional space. The regularization of the GEP has also been addressed by many methods [44,46,[60][61][62][63][64]. The most intuitive way of regularizing the integrals is by inserting a large but finite cutoff Λ which provides the physical units of the theory, as in lattice calculations where the finite lattice spacing a cuts the energies larger than Λ ∼ 1/a. In the Euclidean space, the integral J reads and is a finite positive-definite function of the mass parameter. The gap equation, Eq.(25), has a well defined solution at m 2 = m 2 0 = c λ λΛ 2 /(32π 2 ) where c λ is a coefficient of order unity, with 0 < c λ < 1 and c λ ≈ 1 in the limit λ → 0. Since the derivative is negative for any value of m 2 , the derivative of the effective potential in Eq.(24) changes sign at m = m 0 and becomes positive for m > m 0 . Thus, the GEP has an absolute minimum at m 0 and the simple cut-off regularization predicts a mass. The existence of a minimum at m = m 0 > 0 makes sense when compared with the data of lattice simulations that predict the existence of a finite mass in the limit m 2 B → 0 + of the unbroken-symmetry theory [73]. However, that mass is not a dynamical mass and arises from the quadratic divergence of J because no special symmetry protects the theory. That is not a desirable feature in a toy model for Yang-Mills theory since BRST invariance, which is not broken on the lattice, forbids the appearance of diverging mass terms. In that context, dimensional regularization is the first choice since it leaves BRST unbroken and is the simplest and usual way to cancel the quadratic divergence.
Having set d = 4 − ǫ, in the limit ǫ → 0 the integral J is redefined as Jµ ǫ where µ is an arbitrary scale of the order of m and expanding in powers of ǫ Integrating Eq. (19) and neglecting an integration constant (that does not depend on m) In the usual approach of Coleman and Weinberg [72], the divergences are absorbed by the (infinite) integration constants that are traded as finite and physical renormalized parameters. Following that approach, we could hide the poles in the definition of an energy scale Λ ǫ such that and write the integrals K, J as simply as If Λ ǫ were traded as a finite unknown energy scale, then the regularized expressions of J and K would be finite. Let us investigate the limits of Eq.(31) when the definition of Λ ǫ , Eq.(30), is taken literally, in the attempt to give it a physical meaning. While ǫ might even be a complex variable and the physical meaning of the poles is quite obscure in general, Eq.(30) only makes sense if we assume that ǫ is real, at least. Moreover, the expansion can only be trusted if |ǫ log(μ 2 /m 2 )| ≪ 1 which is equivalent to say that Thus, if we literally assume to work in a (4 ∓ |ǫ|)-dimensional spacetime, Eq.(31) holds asymptotically for a very small or a very large mass compared to Λ ǫ . The energy scale Λ ǫ can be regarded as a very large UV cutoff or a very small IR cutoff, according to the sign of ǫ. In both cases, we must face the non-intuitive result that the regularized J and its derivative change sign according to the value of m: for m ≪ Λ ǫ the integral J is negative while for m ≫ Λ ǫ the derivative of J becomes positive, which is at odds with the intuitive result obtained by a simple cutoff in Eqs. (26), (27). Actually, we must recognize that dimensional regularization is not neutral but its way to make sense of divergences is part of the physical interpretation of a field theory, with scaleless integrals that vanish exactly and a less marked difference between UV and IR divergences. Moreover, the use of dimensional regularization is controversial in the scalar theory and different physical theories seem to arise when the limit d → 4 is taken from above (d > 4) or below (d < 4), as first pointed out by Stevenson [62] in 1987. While it is still not obvious if any of them describes the latticeregulated scalar theory, they could be very relevant for our toy model of Yang-Mills theory. After reviewing them briefly, we will show how a dimensional regularization scheme can be set up for the variational effective potential of Yang-Mills theory.
A. The autonomous theory (d < 4) The autonomous renormalization of scalar theory [46,61] can be easily recovered by dimensional regularization for d < 4 [62,64]. It shows spontaneous symmetry breaking and asymptotic freedom but cannot be connected, perturbatively, to the usual low energy phenomenology that emerges by perturbation theory and 1/N expansion [62,63].
The search for a minimum of the GEP yields the coupled equations [61,64] where ϕ 0 is the optimal average value of the scalar field that would eventually break the symmetry if a solution exists. In that case, the other stationary point at ϕ = 0 is a maximum where Eq.(25) holds. If the symmetry is broken Eq.(25) is replaced by the second of Eqs. (33), which has the opposite sign and has a physical solution if ǫ → 0 + (d < 4). In fact, using the first of Eqs.(31), the new gap equation reads where α = λ/(16π 2 ) is a bare effective coupling and Λ ǫ → ∞ in the limit ǫ → 0 + so that α → 0 + is positive. The solution m 0 of the gap equation can be regarded as a physical scale which breaks the symmetry according to the first of Eqs. (33). Assuming that m 0 takes some fixed phenomenological value, the large scale Λ ǫ can be eliminated as so that the theory shows asymptotic freedom. Inserting the explicit expressions of J and K in the effective potential, the GEP at its minimum is [61,64] and Λ ǫ can be sent to infinity (ǫ → 0 + ) yielding a finite energy density, spontaneous symmetry breaking and a finite physical mass m 0 .
At variance with perturbation theory, in principle, the variational method does not require the use of a renormalized coupling. However, it is useful to parametrize the gap equation in terms of a finite running coupling α µ which can be defined according to [60,64] where µ is any finite scale. The gap equation, Eq. (34), is written as a finite renormalized gap equation where m 0 is assumed to be the physical RG invariant mass. As a toy model of Yang-Mills theory, we assume that α µ > 0, so that µ must be larger than m 0 and the running of α µ takes place in the UV sector, limited from below by the Landau pole at µ = m 0 . The beta function is negative and the running coupling shows asymptotic freedom. A plot of the coupling α µ is shown (up to a factor) as a solid line on the right side of Fig. 3. The breaking of symmetry and the existence of a mass scale seem to reverse the usual trivial behavior of the scalar theory. The autonomous behavior is separated from the usual weak coulpling limit which is observed below the Landau pole. However, we must mention that Eq. (38) is just a possible reparametrization of Eq.(34); it is not necessary, since the effective potential is anyway RG invariant at its minimum; and besides, the parametrization is not unique. It has some features that make it a good candidate as a physical renormalized coupling at the scale µ: in fact, reversing Eq.(37) it can be written in the perturbative weak coupling limit as α µ = α[1 + O(α)] and α µ → α in the UV limit µ → Λ ǫ . But, it is not obvious how α µ is related to the four-point function at the scale µ. Moreover, the parametrization is not unique: the existence of a RG invariant energy scale m 0 allows us to define a generic scale and a different running coupling α ′ µ according to yielding by Eq.(34) the finite equation Thus, the coefficient of the beta function is somehow arbitrary and we do not expect that any serious prediction can be made without an explicit calculation of the fourpoint function. Quite interesting, the exponent ν can be taken negative, inverting the sign of the beta function. However, assuming that α ′ µ > 0, we obtain µ < m 0 if ν < 0. The negative beta function would be defined below the Landau pole, and the new parametrization would describe the IR sector of the theory showing the same behavior that is predicted by perturbation theory and 1/N expansion: an increasing running coupling and triviality. For a negative ν, a plot of α ′ µ is shown as a solid line on the left side of Fig. 3. We observe that if ν < 0 then Λ ′ ǫ → 0 in the limit ǫ → 0 + when Λ ǫ → ∞. Let us consider the special case ν = −1 and call δ ǫ = Λ ′ ǫ in order to make clear that it is an infinitesimal IR scale, δ ǫ → 0. Eq.(34) can be written as which has the same identical content as before, but in terms of the IR vanishing scale δ ǫ = m 0 exp(−1/α). Thus the same theory now looks trivial. It is important to see that different parametrizations for ν = ±1, predicting opposite beta functions, refer to different ranges of µ, separated by the Landau pole. Thus the respective weak coupling limits cannot be connected by perturbation theory, yielding a double-valued beta which is legitimate when the running coupling is not a monotone function. In fact, joining together the outcome of Eq.(41) for ±ν we obtain which holds for any µ = m 0 , as shown in Fig. 3 where |ν| is arbitrarily chosen to match the strong coupling α s at µ = 2 GeV.

B. The precarious theory (d > 4)
Despite its name, the precarious renormalization of scalar theory [46] predicts the same phenomenology of perturbation theory and 1/N expansion [63]. Its handling by a cut-off is problematic since it seems to be unstable until the cut-off is sent to infinite. It emerges in a natural and straightforward way by dimensional regularization in d > 4, as first shown by Stevenson [62].
In the limit ǫ → 0 − , the energy scale Λ ǫ goes to zero according to Eq. (30). Let us call it δ ǫ in order to make clear that δ ǫ = Λ ǫ → 0. In the same limit, the coupled equations for the minimum of the GEP, Eqs. (33), have no solution because the bare coupling α would become negative in Eq. (34). There is no spontaneous symmetry breaking and the minimum of the effective potential is at ϕ = 0. At that point, having ruled out the breaking of symmetry, Eq.(25) holds and can be written as which has the opposite sign of Eq.(34). In the limit δ ǫ → 0 the bare coupling α is positive and an acceptable solution m 0 exists. As before, we assume that m 0 is a RG invariant physical mass which is generated dynamically in the massless theory. Thus, the small energy scale δ ǫ can be eliminated as δ ǫ = m 0 exp(−1/α) in the effective potential. We observe that Eq. (44) is identical to Eq. (42), and the theory appears as trivial. At its minimum ϕ = 0, the effective potential is given by Eq.(23) and inserting the regularized expressions of the integrals J, K, as given by Eqs. (31)  and is shown in Fig. 4. The only physical point is the absolute minimum at m 2 = m 2 0 where the effective potential does not depend on the bare coupling α and takes the value Then we can safely send ǫ → 0. We obtain the same identical vacuum energy that was found in Eq.(36) by the autonomous renormalization in d < 4, but here the mass m 0 is generated without any symmetry breaking. We observe that the stationary point m 0 is the physical mass that emerges as the pole of the self-consistent propagator. Actually, up to first order, the self-energy is the sum of the tree-level counterterm −m 2 and the seagull graph Σ 1L in Eq. (21), so that the self-consistency condition m = m 0 is equivalent to the vanishing of the first-order self energy [57] which is just the stationary condition Eq.(25) satisfied by m 0 . As discussed for d < 4, we do not need to introduce any running coupling in the variational calculation, because the effective potential is finite in units of m 0 . However, it might be useful to reparametrize the gap equation by a finite running coupling α µ which can be defined as before [62] where µ is an arbitrary energy scale. The gap equation, Eq. (44), is then written as a finite renormalized gap equation where m 0 is the physical RG invariant mass. Since we assume that α µ > 0, here µ must be smaller than m 0 and the running of α µ takes place in the IR sector, below the Landau pole at µ = m 0 . While we could deduce, naively, that the theory is trivial and the beta function is positive, again we must recognize that the parametrization is not unique and the running of α µ is limited in the IR sector. In fact, Eq.(50) is identical to Eq.(41) for ν = −1 and the present theory gives the same running predicted by the autonomous theory in the IR sector. Again, the existence of the RG invariant mass m 0 allows us to define a new energy scale and a different running coupling α ′ µ according to yielding by Eq.(44) the finite equation Joining together the outcome of Eq.(53) for ±ν we obtain the same identical result of Eq.(43) which holds for any µ = m 0 and is shown as a solid line in Fig. 3. We conclude that, up to an unknown factor ν, the beta function might have the same behavior in both renormalization schemes.

C. A toy model for Yang-Mills theory
When regularized dimensionally, two different renormalized theories seem to emerge in the limit d → 4. However, for many aspects, the two renormalized theories appear as two sides of the same coin. Both theories share a dynamical mass generation, the same vacuum energy density, a Landau pole at µ = m 0 and can be parametrized by the same running coupling which is not monotone, showing asymptotic freedom in the UV and a trivial Gaussian fixed point in the IR.
In both cases the Landau pole that emerges in the reparametrization has no effect on the effective potential which is RG invariant and is valid at any energy scale. Actually, at variance with perturbation theory, the variational method does not even require the use of a running coupling. However, the existence of the pole says that the two weak-coupling limits cannot be connected by perturbation theory which must break down at the scale µ ≈ m 0 . In fact, by general arguments, perturbation theory predicts that the beta function must be unique at the lowest orders of approximation and cannot depend on the special regularization scheme. But, if the running coupling is not a monotone function, a double valued beta function is found, taking different (opposite) values in different sectors that cannot be connected by perturbation theory. That scenario is only compatible with the existence of a RG invariant phenomenological energy scale where perturbation theory breaks down.
If we look at the strong coupling α s of Yang-Mills theory in the Taylor scheme, a non-monotonic behavior is found in the Landau gauge on the lattice [4], assuming that the ghost-gluon vertex is regular and a running coupling can be defined from the product of the dressing functions of two-point correlators. Some lattice data of Ref. [4] are shown in Fig. 3 together with the analytical prediction of Ref. [39], obtained by a one-loop massive expansion around the zeroth-order Gaussian propagator (−p 2 + m 2 0 ) −1 with m 0 = 0.73 GeV. The energy µ ≈ 0.7 GeV, where the coupling reaches its maximum, is the phenomenological scale where perturbation theory breaks down. Somehow, the running coupling α µ of Eq.(43) can be seen as a zeroth-order Gaussian approximation for the strong coupling α s (µ) of Yang-Mills theory. Actually, that is no coincidence since a gauge invariant effective potential will be derived in the next section for Yang-Mills theory, which is exactly the same GEP of Eq.(46) and Fig. 4, apart from a normalization factor and the precise definition of the effective coupling α. Thus, irrespective of the agreement with the latticeregulated scalar theory, the dimensional-regulated GEP of scalar theory is a useful toy model for pure Yang-Mills theory.
The two scalar theories only differ because of the breaking of symmetry which appears for d < 4; while, for d > 4, a dynamical mass generation occurs without any symmetry breaking. Since gauge symmetry is not broken in Yang-Mills theory, we expect that the correct phenomenology can only be reproduced if we adopt the second scheme and regularize the theory keeping d > 4.

IV. GEP AND MASS GENERATION IN SU(N ) THEORY
The Lagrangian of pure SU(N ) Yang-Mills theory can be written as where L Y M is the Yang-Mills term L f ix is a gauge fixing term and L F P is the ghost Lagrangian arising from the Faddev-Popov determinant. In terms of the gauge fields, the tensor operatorF µν iŝ and the generators of SU(N ) satisfy the algebra with the structure constants normalized according to If a generic linear covariant gauge-fixing term is chosen where ξ > 0 is an arbitrary positive number, the total action can be written as S tot = S 0 + S I where the freeparticle term is and the interaction is with the usual local interaction terms that read In Eq. (61), ∆ 0 and G 0 are the standard free-particle propagators for gluons and ghosts and their Fourier transforms are Here the transverse and longitudinal projectors are defined as where g µν is the metric tensor.
As discussed in Refs. [39,40], an unconventional massive expansion can be introduced by adding and subtracting mass terms δS i in the total action, just like we did for the scalar theory in Eqs.(2),(3). The method can be generalized by redefining the free and interacting parts of the action For the gluon we can take where the vertex function δΓ µν is given by a shift of the inverse propagator and ∆ m µν is the massive free-particle propagator As a general variational ansatz, the two masses m and m L can be different.
In principle, we would also have the freedom to insert a mass shift δS gh for the ghost together with its counterterm δΓ where G M would be a massive ghost propagator One could wonder if the inclusion of a mass parameter in the trial ghost propagator could shift the pole of the ghost at one-loop, yielding a phenomenological mass which would be at odds with the lattice data for the dressed ghost propagator. However, in the massive expansion of the propagators [39,40] the counterterm cancels the shift at tree level and any real mass term can only arise from loops. That is the reason why no mass would arise for the photon in QED by the same method. It can be easily shown [71] that the ghost self energy is of order O(p 2 ) and vanishes when the external momentum p → 0, so that the dressed ghost propagator still has a pole at p 2 = 0. That is an other way to see that the gluon mass arises from gluon loops in the expansion and is not a mere shift by a mass parameter. The case of a finite ghost trial-mass M > 0 has been explored in Ref. [74] and found to be sub-optimal when compared with the standard choice of a massless ghost. Then, we will assume M = 0 in the present variational study. It must be mentioned that, if the ghost mass M were regarded as an independent variational parameter, then its stationary point would be at M = 0 because there are no ghost-gluon vertices in the first order effective potential. Actually, the ghost contribution would be maximal at that stationary point, because of the wrong sign of ghost statistics. However, as discussed in the next section, in the more general context of the finite temperature formalism, a maximal ghost energy minimizes the eventual weakening of Jensen-Feynman inequality that might occur in non-Abelian theories. While that weakening cannot be avoided entirely, we will suggest a rigorous way to control the error on the variational bound. Let us take aside the problem for a while and assume that the GEP can be trusted as a variational method.
Since we have not changed the total action at all, we know that the sum of all graphs contributing to the longitudinal gluon polarization must give zero, because of gauge invariance. Thus, the exact longitudinal part of the gluon propagator must be equal to the free longitudinal propagator ∆ L 0 (p) = ξ/(−p 2 ). While, in principle, m L could be used as a variational parameter, we expect that the best result is achieved if the trial ∆ L m is taken to be equal to the exact ∆ L 0 by setting m L = 0 in Eq.(69). Having set M = m L = 0, the variational ansatz becomes the same that was used in the massive expansion of Refs. [39,40,42] where no ghost and longitudinal masses were inserted. Only the pole of the transverse free-particle propagator is shifted and compensated by inserting a transverse counterterm among the vertices of the interaction, while the gaugedependent longitudinal part of the gluon propagator is left unchanged and equal to the exact result. As shown in Ref. [42], that massive expansion is in very good agreement with the data of lattice simulations. Moreover, that choice of counterterms has the merit of providing a fully gauge invariant GEP at T = 0, as shown below. The calculation of the GEP follows the same steps as for the scalar theory. The GEP is obtained as the firstorder effective potential in the covariant formalism, including the counterterms among the interaction vertices and in the limit of a vanishing background field, i.e. assuming that A aµ = 0 since gauge symmetry is not broken in the vacuum. The effective action reads e iΓ(a) = 1P I D A,ω e iS0(a+A,ω)+iSint(a+A,ω) (74) and the effective potential follows as V = −Γ(0)/V 4 and is the sum of all connected 1PI vacuum graphs. The first order graphs contributing to the GEP are shown in the second row of Fig. 1. The zeroth order gluon and ghost loops in Fig. 1 give The determinant of ∆ µν m can be split as the product of determinants in the orthogonal Lorentz subspaces, (76) where d = 4 in a four dimensional space-time.
The constant gauge dependent (infinite) term Tr log ξ is canceled by an equal factor in the normalization of the Faddeev-Popov functional, so that using ∆ L 0 /ξ = −G 0 , one-half of the ghost cancels the longitudinal term yielding where N A = N 2 − 1.
The crossed one-loop graphs in Fig. 1 are obtained by one insertion of the counterterms. Since there are no ghost and longitudinal counterterms, there is only one crossed loop for the transverse gluon. The identity Eq.(16) changes its sign for ∆ T m and inserting the counterterm of Eq.(73) the sum of all one-loop graphs (zeroth and first order) can be written as which reads The functions K(m) and J(m) were defined in Eq. (20) and their explicit regularized expression were given in Eq. (31). The formal result of Eq.(79) is gauge invariant and also valid at finite temperature, since Eq.(16) still holds when the integrals K, J acquire a thermal part. The first-order effective potential also includes the twoloop gluon graph in Fig. 1. For m L = 0 each loop of the longitudinal propagator contributes a factor ξJ(0) which is zero by dimensional regularization, so that the twoloop term is also gauge invariant at T = 0. The same identical expression would be obtained in Landau gauge (ξ = 0) if m L > 0. The calculation is formally different in the finite temperature formalism and will be studied in the next section. Here, we examine the vacuum part that contributes to the GEP at T = 0 and is relevant for discussing the issue of mass generation. Inserting the seagull one-loop graph [71] the two-loop term reads Setting d = 4 and adding the one-loop term of Eq.(79), in terms of the new effective coupling α a gauge invariant GEP is found that can be written as having dropped the constant K(0) which is zero at T = 0. That is the same identical result obtained in Eq.(23) for the scalar theory, provided that the effective coupling α is replaced by λ/(16π 2 ). Thus, using the same dimensional regularization scheme of Section III and keeping d > 4, the renormalized GEP of Eq.(46) is recovered in units of the optimal gluon-mass parameter m 0 . Inserting the correct normalization factor, the GEP reads and was shown in Fig. 4. That figure shows the existence of two competing stationary points for the vacuum: an unstable stationary point at m = 0 and a stable minimum at m = m 0 . The existence of a stable massive vacuum is a remarkable non-perturbative prediction of the present variational method and can be regarded as an argument for mass generation in pure Yang-Mills theory. We are tempted to identify the unstable stationary point at m = 0 with the massless scaling solution of Schwinger-Dyson equations. That solution is not found in lattice simulations.
In the next section, we will show that the two stationary points acquire a very different behavior at finite temperature. The massless vacuum at m = 0 develops a thermal mass that increases with temperature like for a standard massless boson, while the minimum at m = m 0 shows a decrease of the mass until a weak first order transition occurs before the merging of the minima.
As shown in Fig. 4, when written in physical units of m 0 , the renormalized GEP is not very sensitive to the actual value of the strong coupling α s , especially at the stationary points that might be identified as physical configurations. Thus everything seems to be settled by the physical scale m 0 , while the coupling α s must be regarded as a bare coupling at the scale Λ ǫ according to our renormalization scheme discussed in Section III. Its actual value should be almost irrelevant and will be fixed by the principle of minimal sensitivity [66] as the stationary point of the critical temperature.
Since there is no scale in the original Lagrangian, the actual value of the mass m 0 cannot be predicted by the theory and must come from the phenomenology. The massive expansion of Refs. [39,40] arises as the natural expansion around the best trial massive vacuum at m = m 0 . By that expansion, at one loop, the gluon propagator was found in perfect agreement with the data of lattice simulations [42] in the Landau gauge. The inverse dressing function, which is basically given by the gluon self-energy, is determined without any free parameter and is not monotone, with a pronounced minimum that allows us to fix the energy scale with good accuracy. As shown in Fig. 3, the one-loop analytical expression for the running coupling reproduces the lattice data very well. Sharing the same units of the lattice data in the Landau gauge, the scale m 0 = 0.73 GeV is extracted for N = 3 [39,42]. We will use that scale in the next sections.

V. THE GEP AT FINITE TEMPERATURE AND DECONFINEMENT
At finite temperature, supposing that Jensen-Feynman inequality Eq.(11) holds, the first-order free energy is bounded below by the exact free energy F(T ) that can be expressed as where the thermal action is the integral over imaginary time defined in Eq.(9). If we split the action as in the previous section, inserting the mass term Eq.(67) in the free part and the counterterm Eq.(73) among the vertices, the free energy in Eq.(85) is expanded by the same formal massive expansion as before. The first-order approximation F 1 (T, m) depends on the mass parameter m and is given by the same graphs in the second row of Fig. 1. When optimized it gives the GEP, while the optimal value of m that minimizes F 1 (T, m) provides the best trial mass parameter m(T ) at finite temperature, so that m(0) = m 0 .
In non-Abelian theories, the GEP might be bounded below by an approximate free energy rather than the exact free energy. Actually, the existence of ghosts in the covariant formalism and the appearance of states with negative norm in the Hamiltonian formalism might limit the use of Jensen-Feynman inequality Eq. (11) and Bogolubov's inequality Eq.(10), respectively, unless we have some physical evidence about the safe cancellation of the unphysical degrees of freedom in the averages. However, we can show that a weaker form of Jensen-Feynman inequality still holds for the GEP.
The partition function in Eq.(85) can be written as where M F P (A) is the Faddev-Popov matrix, which is linear in the field A µ a , and S ′ is the original total action without any ghost term, obtained by setting ω a = 0 in the sum S 0 + S int . We can also define zeroth order free energy F ′ 0 and partition function Z ′ 0 without ghost terms as where S ′ 0 is the quadratic part of S ′ , including the gluonmass term. The exact free energy F exact follows as where S ′ int = S ′ − S ′ 0 and the average over A µ a is defined according to In Eq.(88), we can use Jensen inequality in the pure bosonic average of the convex exponential function and write where is the sum of all first-order gluon graphs in the second row of Fig. 1 and gives the gluon contribution to the firstorder free energy, while F gh is a ghost free-energy given by which is different from the sum of all first-order ghost graphs F gh 1 contributing to the GEP in Fig. 1. If the ghost term F gh were known exactly, then its sum with the gluon first-order term F ′ 1 would provide through Eq.(90) a pure variational approximation, bounded below by the exact free energy.
We can loop expand F gh by inserting the explicit form of the matrix M F P . In any linear covariant gauge where the massive ghost propagator was defined in Eq. (72) and takes account of a generic shift of the pole, while δM(A) is the sum of the ghost vertex of L gh in Eq.(63) (proportional to the gauge field A µ a ) and the ghost counterterm δΓ of Eq. (71). Expanding the log we obtain βF gh = Tr log G M − Tr (G M δΓ) which is a sum of vacuum ghost graphs with insertions of the standard vertices. The first two terms of the expansion are just the first-order ghost graphs in Fig. 1 and give the ghost term F gh 1 contributing to the GEP. The third term is the two-loop graph which might be added to the first-order terms for improving the approximation, as discussed by previous work in the Lagrangian and Hamiltonian formalism [24,58]. We observe that, while the bound in Eq.(90) is exact, any arbitrary truncation of the expansion would invalidate it. Thus, there is no way to tell if adding the two-loop term would give a better result compared with the simple GEP where only the first-order terms are retained. Denoting by δF the difference between the exact ghost term and the first-order terms retained in the GEP We can write the exact bound in Eq.(90) as The GEP might actually fall below the exact free energy, but we can minimize the problem by maximizing the ghost term F gh 1 in the GEP, as suggested by Eq.(96). In fact, it can be easily shown that δF ≥ 0 and F gh 1 is bounded above by the exact ghost term F gh . By use of Jensen inequality in the average of the log in Eq.(92) and since F gh 1 is maximal at its stationary point M = 0, that point is also the safest choice that maximizes the ghost term without reaching the exact value F gh . Having shown that δF is positive, we could estimate its value by an explicit evaluation of the two-loop term in Eq.(95) in order to keep the approximation under control. We must mention that the GEP might be closer to the exact free energy than expected by the mathematical bound of Eq.(97) since δF is just the maximal error that we have been able to establish in the worst case. In fact, by a comparison with the data of lattice simulations, we will show that at finite temperature the GEP does very well, better than expected by the present analysis.
At finite temperature, the explicit calculation of the GEP follows by the graphs of Fig. 1. The sum of one-loop graphs is still given by Eqs. (79) where the integrals K, J in Eq.(18) now include a sum over discrete frequencies and their explicit expressions in Eq. (20) are replaced by having used in Eq.(18) the massive free propagator ∆ m (ω n , p) = 1 in the Euclidean space where p µ = (ω n , p) and ω n = 2πnT . In the limit T → 0 the vacuum integrals in Eqs. (20) are recovered as J(m) = J(0, m) and K(m) = K(0, m). We denote them by J V (m) and K V (m), respectively. They contain the diverging part of the integrals and can be regularized as discussed in the previous sections. Their explicit expression is given by Eqs. (31). The thermal parts are finite but depend on T . We denote them by J T (T, m) and K T (T, m), respectively. Omitting the arguments for brevity, they can be written by an explicit calculation as where ǫ k,m = √ k 2 + m 2 and n(ǫ) = [exp(βǫ) − 1] −1 is the Bose distribution.
The first-order free energy F 1 (T, m) can be written as the sum of one-loop and two-loop terms The sum of one-loop graphs is obtained by just setting d = 4 in Eq.(79) The second term F 2L (T, m) is the two-loop graph in the second row of Fig. 1. Because of the breaking of Lorentz invariance at finite T , its expression gets formally different than the vacuum term in Eq.(81) and also becomes gauge dependent. In order to make contact with previous analytical and numerical work in the Landau gauge we set ξ = 0, which is the most common choice for the study of the correlators, so that the scale m 0 = 0.73 GeV will be used. In fact, that scale was extracted by matching the predictions of the massive expansion with the data of numerical simulations in the Landau gauge [39,42]. Assessing the whole gauge dependence of the GEP at finite temperature is not an easy task, as the scale m 0 should be also changed by matching the gauge-dependent correlators in a different gauge.
Following the same steps of the previous sections, in the Landau gauge, the seagull graph of the gluon self energy can be written as [71] is the Euclidean propagator in Eq.(100). Integrating the single terms, it can be written as where The trace of I µν is I µµ = J, so that at T = 0, by Lorentz invariance, the self energy of Eq.(80) is recovered for d = 4. At finite temperature, I µν is still diagonal but I 00 = I ii . By rotational invariance, using the trace again, we can write which holds separately for the thermal and vacuum parts. While the vacuum part is just I 00 V = I ii V = J V /4, the thermal part can be obtained by an explicit integration as where h m is the integral that can be evaluated exactly for m = 0 yielding Closing the second loop with the transverse gluon propagator (ξ = 0) and inserting the symmetry factor 1/4 Then, using Eq.(105), the two-loop term reads The GEP is shown in Fig. 5 for different values of the temperature and in Fig. 6 for several values of the coupling α s . As already discussed in the previous sections, the GEP is not very sensitive to the coupling, especially in the physical ranges around the minima and for T < 2T c ≈ 0.5 GeV. While the physical value of the GEP was not sensitive at all to a change of α s at T = 0, other observables, at finite temperature, might depend on α s because the variational method is not an exact calculation. In lattice simulations, the bare coupling and the cutoff are finite, since the lattice spacing cannot be set to zero. However, a stationary regime is reached where the physical predictions seem to be not sensitive to the actual value of the bare coupling. In the present calculation, because of the approximations, we fail to reach an exactly stationary regime for all the thermal observables. Albeit small, a residual sensitivity to the bare α s is found, posing the problem of the choice of the coupling. We argue that, for any finite value of coupling and cutoff, the outcome of the variational calculation is more reliable and closer to the lattice data if the physical observables are less sensitive to the arbitrary value of the bare coupling. Thus, the best agreement with the data of lattice simulations is expected in the range 0.6 < α s < 1.2 where a real plateau is observed, rather than in the limit α s → 0 where a slightly larger sensitivity is found. For that reason, even if α s should be sent to zero in the limit ǫ → 0, we prefer to keep α s fixed at the optimal value α s = 0.9 in the following discussion and in the comparison with the lattice data. We checked that any other choice does not introduce important changes in the results.
At finite temperature, we observe that the minima of the GEP have a very different behavior. The absolute minimum at m = m 0 is almost frozen when T ≪ m 0 , as expected for a massive confined gluon. When the temperature increases the minimum moves backwards, so that the optimal mass parameter m(T ) is a decreasing function of the temperature, in fair agreement with the decrease of mass that is observed on the lattice below T c [10]. The unstable minimum, at m = 0 in Fig. 4, moves forward when T > 0 and its mass value increases almost linearly like the thermal mass of a massless boson. It gets deeper with increasing temperature. Thus the GEP seems to show the competition between a confined boson with a dynamical mass and a free boson with a thermal mass. As shown in Fig. 5, at a critical temperature T c ≈ 0.35m 0 the minima reach the same free energy before they can merge, so that a weak first-order phase transition is predicted with a discontinuous drop of the optimal mass parameter m(T ) that is displayed in Fig. 7. The free energy at the minima is shown in Fig. 8 across the transition. Below the transition point, the upper curve is the GEP at the unstable thermal mass, while the lower curve is the GEP at the stable dynamical mass. Above the transition point they reverse. At any temperature, the physical free energy is the lower curve F G (T, m(T )).
The slight effect of a change of α s on the critical temperature is less than ±1% in Fig. 9, where it is shown at a very enlarged scale. Apart the effect of the scale, the critical temperature is basically unchanged for a large range of α s , including the phenomenological interval 0.4 < α s < 1.2 which would be ranged by a running coupling in the IR. The plateau has a stationary point at α s ≈ 0.9 where T c = 0.349 m 0 . We take that as the best prediction of the GEP according to the principle of minimal sensitivity [66].
Using the scale m 0 = 0.73 GeV that arises for N = 3 from the massive expansion at one-loop [37][38][39][40][41][42], we predict T c = 255 MeV, which is very close to the value T c = 270 MeV that is found on the lattice [10].
It is important to mention that if the bare coupling were sent to zero in the limit ǫ → 0, the resulting qualitative picture would remain basically unchanged. In the limit α s → 0, the deconfinement transition still takes place, is weakly first order and with a critical temperature T c ≈ 0.32 m 0 not too far from that found on the plateau. The only relevant difference is in the behav-ior of the unstable minimum, whose position does not change with the temperature and remains fixed at m = 0 for every value of T , even if it gets deeper and eventually becomes the stable minimum above T c . Thus, in the limit α s → 0, the optimal mass parameter is m ≈ m 0 for T < T c , and m = 0 for T > T c . In the same limit, the critical temperature can be estimated by observing that the gluon thermal term is exponentially suppressed at m ≈ m 0 and cancels the opposite ghost term, so that the minimum of F G (T, m) is basically frozen at the vac-  The equation of state can be studied by introducing pressure and entropy density according to The reader might have noticed in Fig. 5 that below T c the minimum at m = m 0 moves slightly upwards. That behaviour gives an unphysical negative entropy for a limited range of temperatures, as reported by other massive approximations at one-loop [33,35] and by other variational methods [75]. That minor shortcoming might be expected since the contribution of the massless ghost is enhanced when T ≪ m compared to the massive gluon. The problem becomes more evident if we look at the ratio p/T 4 in the limit T → 0. That ratio should be exponentially suppressed and dominated by the lightest glueball mass, in agreement with the data of lattice simulations [76][77][78]. By inspection of Eq.(114), we observe that while the thermal functions K T , J T , I 00 T are exponentially suppressed, the second term on the right hand side contributes with the fourth power of T , originating from the massless ghost loop in Eq.(110) which, besides, is taken with the opposite sign. When all other terms are suppressed, the ghost loop dominates the leading behavior yielding a finite non-zero ratio in the limit and a negative entropy in the same limit. That seems to be a shortcoming of the Landau gauge, since the same identical finite values were found in Refs. [34,35] in that gauge. The same authors find smaller finite values and a positive entropy in the Landau-De Witt gauge by a twoloop calculation. As discussed in Ref. [75], one would be tempted to cancel the unphysical term by hand, but that term gives an important contribution above the transition where it cancels unphysical gluon terms. On the other hand, the mismatch can only be observed below T c where the exact free-energy is almost constant and the pressure is basically zero, so that even a very small (positive) deviation can give an increasing freeenergy and a decreasing pressure. Actually, the effect can be hardly seen in Fig. 10 where the pressure of Eq.(115) is shown together with the recent lattice data of Ref. [76] which are consistent with previous existing data [77,78]. We observe that the figure is not a fit and that there are no free parameters in the calculation. Moreover, in units of T c the pressure in Fig. 10 does not even depend on the energy scale m 0 . Thus, it is remarkable that the data points fall so close to the prediction of the calculation, at least for T < 2T c . As shown in the figure, the GEP provides a pressure that seems to be bounded above by the data points, as expected if the GEP were bounded below by the exact free energy, suggesting that the error in the ghost free-energy δF might be very small in Eq.(97). For comparison, in Fig. 10 the pressure is also shown for a coupling α s = 0.6, smaller than the optimal value α s = 0.9. While the predictions are not sensitive to the choice of the coupling at low temperature, above 1.5 T c the pressure acquires a slight dependence on it and the agreement with the data improves by decreasing α s . The problem of a negative entropy becomes more evident in Fig. 11 where the entropy density of Eq.(115) is shown together with the lattice data of Ref. [76]. The small jump of the entropy density at T = T c is ∆s/T 3 c = 2.7 yielding a latent heat ∆H 0 = 2.7 T 4 c which is larger than the values 1.3 − 1.5 found in lattice simulations [76][77][78]. However, we expect that the overall picture of dynamical mass generation, deconfinement transition and equation of state might improve greatly by adding higher-order terms of the expansion in the free energy, as it is the case for the dressed propagator which gets on top of the lattice data when the one-loop terms are added to the zeroth-order massive propagator ∆ m = 1/(p 2 + m 2 0 ) [39,40,42].

VI. DISCUSSION
The self-consistency gap equation of the GEP, Eq. (25) has attracted a lot of attention in the past [19,20,45] as a basic physical tool for explaining the dynamical mass generation of Yang-Mills theories. The main difficulty of handling the gap equation has always been the regularization of the diverging integral J(m) and its physical meaning. Here, we have shown that, by dimensional regularization in d > 4, the GEP provides a reasonable account of the general features of Yang-Mills theory. The existence of a deep minimum at m = m 0 = 0 can be regarded as a variational argument for dynamical mass generation in the original scale-less theory.
In order to enforce our confidence on the genuine physical nature of the minimum, we explored the model at finite temperature. The emerging scenario for the equation of state and the deconfinement transition is in very good agreement with the data of lattice simulations, leaving no doubt about the physical interpretation of the minima in the GEP.
Moreover, the method provides a perturbative tool for improving the results order by order. The expansion around the optimal vacuum of the GEP turns out to be the massive expansion developed in Refs. [38][39][40] which provides accurate and analytical expressions for the propagators at one-loop already. Once the non-perturbative effects are embedded in the optimal variational mass, the residual interaction can be described by perturbation theory yielding a powerful analytical tool for QCD in the IR.
Thus, we argue that the present variational estimate of the thermodynamical potentials might be improved by inclusion of higher order terms. Second order extensions of the GEP have been discussed by several authors [56][57][58][59]. In general, they do not retain the genuine variational property of the GEP but different optimization strategies have been proposed ranging from the principle of minimal sensitivity [66] to the method of minimal variance [68][69][70][71]. Explicit massive two-loop thermal graphs have been evaluated in Ref. [35]. Here, we limited the calculation at the first order, just because we preferred to maintain the genuine variational nature of the method unspoiled, as much as Jensen-Feynman inequality allows in presence of ghost fields. Nevertheless, the pure GEP provides a remarkably good picture of the deconfinement transition. From first principles, without any fit parameter, the simple firstorder calculation predicts a weak first order transition at T c ≈ 250 MeV for N = 3, with a pressure which is very close to the data points of lattice simulations. We must mention that the method fails to predict a continuous transition for N = 2. That could be the consequence of a known issue for the GEP which usually predicts a weak first-order transition even when the transition is secondorder, e.g. for the scalar theory [46,56]. In that case, a continuous transition is restored by inclusion of second order terms [56]. Moreover, the GEP is known [43,63] to predict the correct N → ∞ limit of 1/N expansions, so that its reliability increases when N is large.
Finally, even if the present variational study is limited to the low temperature range T < m 0 , where no resummation of hard thermal loops is required because of the finite mass in the loops, the effects of a finite mass become negligible for large energies and T ≫ m 0 and the standard results of perturbation theory would be recovered by the massive expansion in that limit.