Photon emissivity of the quark-gluon plasma: a lattice QCD analysis of the transverse channel

We present results for the thermal photon emissivity of the quark-gluon plasma derived from spatially transverse vector correlators computed in lattice QCD at a temperature of 250 MeV. The analysis of the spectral functions, performed at fixed spatial momentum, is based on continuum-extrapolated correlators obtained with two flavours of dynamical Wilson fermions. We compare the next-to-leading order perturbative QCD correlators, as well as the ${\cal N}=4$ supersymmetric Yang-Mills correlators at infinite coupling, to the correlators from lattice QCD and find them to lie within $\sim10\%$ of each other. We then refine the comparison, performing it at the level of filtered spectral functions obtained model-independently via the Backus-Gilbert method. Motivated by these studies, for frequencies $\omega\lesssim2.5\,$GeV we use fit ans\"atze to the spectral functions that perform well when applied to mock data generated from the NLO QCD or from the strongly-coupled SYM spectral functions, while the high-frequency part, $\omega\gtrsim 2.5\,$GeV, is matched to NLO QCD. We compare our results for the photon emissivity to our previous analysis of a different vector channel at the same temperature. We obtain the most stringent constraint at photon momenta around $k\simeq0.8\,$GeV, for which we find a differential photon emission rate per unit volume of $d\Gamma_\gamma/d^3k = (\alpha_{\rm em}/(\exp(k/T)-1))\times (2.2 \pm 0.8 ) \times 10^{-3}\,{\rm GeV}$.

We present results for the thermal photon emissivity of the quark-gluon plasma derived from spatially transverse vector correlators computed in lattice QCD at a temperature of 250 MeV. The analysis of the spectral functions, performed at fixed spatial momentum, is based on continuum-extrapolated correlators obtained with two flavours of dynamical Wilson fermions. We compare the next-to-leading order perturbative QCD correlators, as well as the N = 4 supersymmetric Yang-Mills correlators at infinite coupling, to the correlators from lattice QCD and find them to lie within ∼10% of each other. We then refine the comparison, performing it at the level of filtered spectral functions obtained model-independently via the Backus-Gilbert method. Motivated by these studies, for frequencies ω 2.5 GeV we use fit ansätze to the spectral functions that perform well when applied to mock data generated from the NLO QCD or from the strongly-coupled SYM spectral functions, while the high-frequency part, ω 2.5 GeV, is matched to NLO QCD. We compare our results for the photon emissivity to our previous analysis of a different vector channel at the same temperature. We obtain the most stringent constraint at photon momenta around k 0.8 GeV, for which we find a differential photon emission rate per unit volume of dΓ γ /d 3 k = (α em /(exp(k/T ) − 1)) × (2.2 ± 0.8) × 10 −3 GeV.

I. INTRODUCTION
Photons and lepton pairs have long been considered to provide direct information on the quark-gluon plasma (QGP), since they are penetrating probes of the QGP due to their colourless nature [1][2][3][4][5]. During heavy-ion collisions, photons can escape the plasma without scattering via the strong interaction, but discriminating between different sources is quite challenging because of the continuous emission of photons during the spacetime evolution of the fireball. The detected photons are divided into two main categories: decay and direct photons. The former refers to photons coming from the electromagnetic decays of final state hadrons, while the latter includes all photons created in the collision before the final hadrons completely decouple. The decay photons give a much larger contribution to the signal than the direct photons and they provide valuable information for particle reconstruction. However, when studying direct photons, that contribution amounts to a large background and has to be subtracted from the total photon yield.
Direct photons are produced via several mechanisms during the evolution of a heavyion collision (c.f. [5]), but the two major sources are initial hard parton-parton scattering (prompt photons), and photons originating from the QGP (thermal photons). The latest direct photon yield results have been published by the PHENIX [6] and STAR [7] collaborations at RHIC, and by ALICE [8] at the LHC. The direct photon yield measured by the PHENIX and by the ALICE collaborations shows an excess at low transverse momentum, p T 3 GeV, with respect to the theoretical predictions [9,10]. The results of the STAR collaboration [7], however, are in better agreement with the model results. The excess observed in the PHENIX and ALICE results is in the momentum range where the dominant contribution comes from thermal photons [10].
An important ingredient in the theoretical prediction is the emission rate of thermal photons per unit volume, which has to be integrated over the expanding spacetime volume of the medium to obtain the total thermal photon yield [10]. This rate has been calculated at leading order in the strong coupling constant [11], and the calculation has been extended to include corrections which arise from interactions with soft gluons [12]. The thermal photon emission rate from this extended calculation has a similar functional form to the leading order one, and represents a 20% increase at α s 0.3. This modest increase reduces the tensions, though it is still insufficient to explain the excess observed by the PHENIX and ALICE collaborations and highlights the leading-order prediction may receive large corrections at relevant temperatures.
Therefore, a precise non-perturbative calculation of the thermal photon production rate using lattice QCD is highly desirable and would help to resolve the tensions or confirm the already existing weak-coupling results. Although lattice QCD has been very successful in calculating observables in vacuum as well as in thermal equilibrium, reliably accessing near-equilibrium quantities such as the thermal photon rate is very challenging, because the calculation of real-time observables requires analytic continuation using a finite, limited number of noisy Euclidean correlator data. Moreover, the Euclidean correlators that need to be inverted to obtain information on the spectral function are rather insensitive to the infrared features of these spectral functions [13][14][15][16]. In spite of the difficulties, several methods have been devised to constrain the ill-posed numerical inversion problem using both model-dependent and independent approaches.
To determine the thermal photon production rate via analytic continuation, the starting Euclidean observable is the vector current correlation function at finite temperature. Although this correlator has been investigated extensively also on thermal ensembles, earlier analyses focused mainly on vanishing spatial momentum, which is relevant for the determination of the electrical conductivity of the plasma [17][18][19][20][21][22]. Continuum-extrapolated vector current correlators at finite spatial momenta obtained in quenched lattice-QCD simulations were analyzed for the photon rate in Ref. [23]. In Ref. [24], the same goal was pursued with continuum-extrapolated vector correlators based on two-flavour dynamical simulations, focusing on an infrared-dominated channel, the difference between transverse and longitudinal channels (T − L channel). In both [23] and [24], information on the spectral function was extracted from the Euclidean data using fit ansätze.
In this work, we apply two different strategies to analyse the correlation functions: the Backus-Gilbert method [25,26] and a fit method where we apply various simple fit ansätze matched to perturbation theory at high frequencies. In contrast to our earlier work, we investigate the spatially transverse channel of the vector current correlator at finite spatial momentum. In Sec. II, we collect the relevant basic formulas and discuss the advantages of investigating this particular channel. We also present in that section an overview of the spectral function in this channel in different regimes. Details on the lattice configurations and the observables relevant in this study, including the continuum extrapolation are presented in Sec. III. The analysis of the continuum-extrapolated correlator in the transverse channel is presented in Sec. IV. There, we also present our estimate on the thermal photon rate based on this channel and compare it to available results from the literature. We finally give our conclusions and an outlook in Sec. V.

II. THEORETICAL BACKGROUND
In this section, we collect the main definitions of the correlation functions to be analyzed and describe the existing theory predictions with which we will confront our lattice QCD calculations.

A. Basic definitions
The spectral function of the vector current J µ =Ψγ µ T a Ψ, for a generic matrix T a acting in quark-flavour space, is defined as The Minkowskian time evolution of the current is given by The expectation value of the commutator is taken with respect to the thermal density matrix, e −βH /Z, with β = 1/T being the inverse temperature. To leading order in the fine-structure constant, α em = e 2 /(4π), letting T a = diag( 2 3 , −1 3 , −1 3 , . . . ) contain the quark electric charges, the thermal photon production rate per unit volume of quark-gluon plasma can be expressed as [27] dΓ γ (k) in terms of the vector channel spectral function ρ V (ω, k) = −ρ µ µ (ω, k), where k = |k|. It corresponds to the choice λ = 1 in the following linear combination, introduced in Refs. [24,28], where denote the spatially transverse and longitudinal spectral functions, respectively. As a consequence of current conservation, expression (3) is independent of λ for light-cone kinematics ω = k. Due to this fact, ρ V (k, k) can be replaced by ρ(k, k, λ) with arbitrary λ in the evaluation of the photon rate [24,28] of Eq. (2). In Refs. [24,29], the λ = −2 case was investigated, which corresponds to the difference of the transverse and longitudinal channels This channel, ρ(ω, k, −2), is particularly interesting because it is non-negative for 0 ≤ ω ≤ k, and highly suppressed when ω > k. Therefore, it is very sensitive to infrared physics of interest. It vanishes in the vacuum and also satisfies a superconvergent sum rule, demonstrated in Refs. [24,28] and utilized as a constraint in the spectral reconstruction from Euclidean correlators in Ref. [24]. In this work, we investigate the transverse channel, corresponding to using λ = 0 in Eq. (3). As we shall see below, it has complementary properties to the previously studied λ = −2 channel; in particular, its spectral function is non-negative for all ω ≥ 0. The transverse-channel Euclidean two-point functions of the vector current carrying a definite spatial momentum are related to their corresponding spectral function via (see e.g. [15]) where the kernel is given as and τ = it. This spectral representation of the Euclidean correlators will first be used to confront them with theory predictions for the spectral function, and, in a second stage, to fit an ansatz for the spectral function to the Euclidean correlators computed in lattice QCD. We remark that in Sec. III and in Sec. IV we use the flavour matrix T a = diag( 1 2 ) in two-flavour QCD, i.e. we calculate the isovector vector current correlator. This specifies in particular the normalisation of our results for G T , G 00 or ρ T . If we are willing to use the approximate SU(3) flavour symmetry in the high-temperature phase of QCD and neglect dynamical strange-quark effects as well as the charm contribution, the computed correlators simply need to be multiplied by the charge factor, C em = (2/3) 2 + (−1/3) 2 + (−1/3) 2 = 2/3, to obtain the electromagnetic current correlator in the physical QGP.

B. Hydrodynamics
The long-wavelength behavior of the spectral function can be studied with the help of hydrodynamics. Let D be the diffusion coefficient for the conserved electric charges. In the hydrodynamic regime ω, k D −1 , the first-order hydrodynamic prediction for the transverse channel spectral function is [30] ρ where χ s denotes the static quark susceptibility, The diffusion coefficient can be expressed using the electrical conductivity, σ, as D = σ/χ s . The functional form of ρ T in Eq. (8) reveals the advantage of investigating this channel, namely that it does not couple to the diffusion pole in contrast to ρ V or ρ(ω, k, −2) [23,24]. Therefore ρ T (ω, k)/ω around ω = 0 does not contain any peak-like structure of arbitrarily small width as k → 0. 1 At extremely high temperatures in QCD, which implies a small value of the strong coupling constant g, a kinetic theory treatment eventually becomes applicable and a narrow peak of width ∼ g 4 T appears [30].
Following Ref. [23], we define the effective diffusion coefficient which tends to D as k → 0 and is the key dynamical ingredient to evaluate the thermal photon production rate of Eq. (2).

C. Resummed spectral functions at NLO in thermal QCD
Analytic predictions are also available for the spectral functions in the weak-coupling limit. The spectral function in the timelike regime -which is relevant for dilepton production -has been investigated using perturbative calculations since the seminal works of Refs. [27,31]. In most of these perturbative calculations, the studied channel was the vector channel spectral function, ρ V . Focusing on the transverse channel, leading-order (LO) results (corresponding to non-interacting quarks) are available for massless quarks in e.g. Refs. [32][33][34]. Recently, this has been extended to NLO at O(g 2 ) calculating the perturbative contributions up to two-loops. The details of this impressive two-loop calculation -valid both for spacelike and timelike virtualities -can be found in Ref. [35], while a comparison to lattice correlators has been presented in Ref. [36].
As we discussed in Sec. II A, the relevant information for the thermal photon production is coming from the spectral function determined at the light-cone. The thermal photon emission rate vanishes for non-interacting quarks [33]. The NLO perturbative thermal photon emission rate can be either determined by evaluating the NLO spectral functions at the lightcone or by making use of the computation in Ref. [11]. The O(g 2 ) calculation of Ref. [11] has been extended to O(g 3 ) by taking into account contributions from soft gluons [12].
The strict two-loop perturbative spectral function develops a logarithmic singularity at ω = k [35,36]. It originates from multiple rescatterings of a quark taking part off-shell in the inelastic annihilation process that produces a photon or in bremsstrahlung [11,34,[37][38][39]. This infrared (IR) singularity is also present in the NLO calculation of the real photon rate [11,37]. The effect is called the Landau-Pomeranchuk-Migdal (LPM) effect [11,37], and can be handled by implementing a proper resummation of ladder diagrams, called the LPM resummation [11,34,36,38].
In order to compare lattice and perturbation theory results, we used the publicly-available implementation of the two-loop calculation presented in Ref. [35] and computed the LPM resummation based on Refs. [34,36]. In our implementation, we used a window function to restrict the LPM contribution to frequencies around the light-cone. The NLO perturbative spectral function complemented by the LPM contribution is called NLO+LPM in the following. For the spatial momentum of k = πT , we display this spectral function in Figure 5 using dashed lines.

D. N = 4 supersymmetric Yang-Mills theory
In the N = 4 supersymmetric Yang-Mills (SYM) theory, the thermal spectral functions can be obtained analytically not only in the weak coupling limit, but in the strong coupling (and large-N c ) regime as well [40]. The field content of the theory is an SU(N c ) gauge field together with massless scalar and fermionic fields in the adjoint representation of the gauge group. In the limit of infinite 't Hooft coupling and infinite number of colours, one can determine the spectral function by making use of the AdS/CFT correspondence and then numerically solving an ordinary differential equation. Our primary interest, the spectral function in the transverse channel, is a smooth function having a similar asymptotic behavior -proportional to ω 2 -at high frequencies as the transverse spectral function in thermal QCD. Although the spectral functions in SYM in the strong coupling limit cannot quantitatively describe those of thermal QCD, their qualitative features are nevertheless instructive and may well be of relevance at T 250 MeV. An example of such a spectral function at strong coupling is shown in Fig. 5 for a spatial momentum k = πT with a solid line.

A. Ensembles and statistics
We employ dynamical O(a)-improved Wilson fermions to simulate two degenerate flavours of quarks with an in vacuo pion mass around 270 MeV. The details of the lattice action can be found in Ref. [41] and references therein. The temperature is T 250 MeV, well above the transition temperature, estimated to be about T c 211 MeV in N f = 2 QCD [42]. We use the ensembles already presented in Ref. [24]; cf. Table I in the latter reference. Although the pion mass is larger than its physical value on these ensembles, we emphasize that quark-mass effects on the correlator are suppressed by (m/T ) 2 in the chirally symmetric phase. For our two coarsest ensembles, labeled as F7 and O7, the bare parameters have been set identical to the vacuum F7 and O7 ensembles used in Ref. [41,43], while for the finer ensembles, labeled as W7 and X7, the tuning to the line of constant physics was performed in Ref. [44]. The lattice spacings for these ensembles are around a 0.066, 0.049, 0.039 and 0.033 fm with an error of about 1% [43]. The physical volume for all ensembles is around L 3.1 fm.

B. Lattice observables
In this section, we introduce the various discretized lattice correlators that we used to perform the continuum extrapolation in Sec. III C. We consider the two-point function of the isovector vector current in QCD with exact isospin symmetry. This allows for precise comparisons with weak-coupling predictions in N f = 2 QCD. As discussed at the end of Sec. II A, the correlator of the electromagnetic current in the physical QGP can be obtained approximately by multiplying our correlator by the charge factor C em = 2/3.
The bare local and the conserved vector current are defined as and (12) respectively, where Ψ = (u, d) represents the isospin doublet of mass-degenerate quark fields and τ 3 is the diagonal Pauli matrix. As in Ref. [24], we have not implemented the additive O(a) improvement of vector currents, as the contribution of the improvement terms to the two-point functions would be suppressed by the quark mass in the chirally restored phase. Using the currents above, we define the following bare unimproved correlators Under time reflections the local-conserved correlator is transformed into the conserved-local one, , and vice versa. Using this fact, we averaged the two appropriately, and we refer to it with the the superscript LC in the following. For the transverse correlator we need only the spatial components of the correlators, which are defined on the lattice sites according to Eqs. (13)(14)(15)(16). In the case of local-conserved or conserved-local correlators, however, we note that the charge-charge correlator can be evaluated on site τ by averaging G LC 00 (τ + a/2, k) and G LC 00 (τ − a/2, k). To obtain the correlator in the transverse channel, we first evaluated Imposing time-reversal symmetry and translation invariance, we then symmetrized the correlators in the time direction and averaged over the momentum orientations: where N k is the number of momenta of norm k.

C. Continuum extrapolation
For the continuum extrapolation, we first interpolated the lattice correlators to the time separations which correspond to our finest lattice, X7. We applied two interpolation methods, Akima and monotonic cubic spline [45][46][47]. For the interpolation, we normalized G T /G 00 by the LO continuum transverse correlator, i.e. we multiplied by T 3 /G LO T in order to have a flattened interpolant. After the interpolation we removed this factor. 2 In order to avoid the renormalization of the local currents, we divided the bare correlator by the bare static susceptibility computed using the same discretization In the following, we omit the label T denoting the transverse channel.
For the continuum extrapolation, we carried out fits by using only a single discretization or multiple discretizations simultaneously of the transverse current correlators for each τ and k. When using multiple discretizations, we performed constrained correlated fits using the following fit ansatz where c 0 is the estimate of the continuum limit from a particular fit, and the c α 1 and c α 2 parameters are characterizing the approach to the continuum of the different discretized correlators. We took into account the correlations between the different discretizations, and minimized the chi-squared statistic where the index e runs over the ensembles and α, α = LL, LC, CC. A regularized covariance matrix Cov was obtained by multiplying the off-diagonal elements by 0.95. The regularization had a non-negligible effect only in the case of linear fits (i.e. when setting the c α 2 coefficients to zero), for which it led to an increased number of acceptable fits.
To increase the robustness of the continuum limit, we implemented a multiplicative treelevel improvement of the lattice data where s G α TLI is the tree-level improved correlator, s G LO and s G α LO are the continuum and the lattice leading-order perturbation theory results, respectively. The tree-level improvement reduces the difference between the various discretizations at finite lattice spacing, which results in more fits having acceptable p-values. It reduces the continuum extrapolated value at smaller Euclidean time separations, τ T < 0.25. The effect of the improvement turned out to be much milder -almost negligible on the final continuum result -for larger distances, τ T ≥ 0.25.
In order to estimate the systematic errors, we carried out extrapolations using the treelevel improved as well as the unimproved data. Further systematic variations included using two interpolation methods to interpolate to the same τ T points on all ensembles and varying the number of data points used in the fits. For the linear fits, only the three finest ensembles have been used and we investigated the robustness of the results by omitting one or two data points belonging to the coarsest two out of three ensembles. In the case of quadratic fits, we proceeded similarly, but we omitted one or two data points from the coarsest three out of four ensembles. When we left out more than one point, we always discarded them from the same discretization. Furthermore, we carried out extrapolations by using only one or two discretizations. These various changes in the analysis resulted in a total number of 6  104 fits. The χ 2 values, the number of fit parameters and the number of data points have been used to calculate the Akaike Information Criterion (AIC) weight of each fit [48,49]. Using these weights, we built a histogram and quoted the median as our final continuum result. The histogram is used in the later stages of the analysis. We also perform the continuum limit of the static susceptibility, which requires including the Z V renormalization constant when the local discretization of the current is used. To obtain the values of Z V at the bare couplings of our ensembles, we utilized the parametrization of Z V (g 2 0 ) given in Ref. [50]. Then, adopting a similar procedure to determine the continuum limit of the static susceptibility as we used for the transverse correlator, we obtain χ s /T 2 = β 3 G 00 (β/2, 0) = 0.882(11) stat (19) sys , see the right panel of Fig. 1. Compared with our result of Ref. [24], we quote a more conservative systematic error due to the continuum limit primarily due to the inclusion of quadratic fits in a 2 in this work. We have also investigated that the use of other determinations of the renormalization factor [51] and the inclusion of the mass-dependent improvement, which lead to sub-leading differences compared to the systematic error from the continuum extrapolation.
In order to obtain G T /T 3 in the continuum, we multiplied the continuum limit of G T /G 00 by the continuum estimate of G 00 /T 3 . The transverse correlator divided by the charge-charge correlator is shown in the left panel of Fig. 2 for k/T ≈ 4.97, and in units of temperature in the right panel. The statistical error on G T /T 3 is typically around 0.25 − 0.6%, while it is in the range 1.1 − 1.4% for G T /G 00 .

D. Continuum-extrapolated correlators vs. theory predictions
Having obtained the transverse-channel correlators in the continuum, we compare them to various theory predictions. In this comparison, we neglect the O((m/T ) 2 ) quark-mass effects present in the lattice results, since m/T ≈ 0.05 in our simulations.
We find that the LO correlator is about 5 − 20% larger at Euclidean time separations τ T > 0.4 for momenta k/T 3.85; see Fig. 2. The deviation is smaller, around ∼ 5%, for smaller momenta. It reduces towards smaller time separations, and below a certain (k-dependent) τ T value, the LO correlator is smaller by about 5% than our continuum result. The NLO+LPM correlator, however, is only about 3 − 4% larger than the lattice result for all momenta and for all time separations 0.17 τ T ≤ 0.5 for which we could reliably determine the continuum limit. We conclude that the perturbative corrections to the LO correlator noticeably improve its agreement with our lattice correlator. The strongly coupled N = 4 SYM theory result also provides a relatively good description, even though that result concerns a different non-Abelian gauge theory and applies in the large-N c limit. The lower panels of Fig. 2 show the ratio of the perturbative and the SYM results to the lattice results.
A word on the multiplicative normalization of the theory predictions is in order. For the perturbative predictions, while no prescription is needed for G(τ, k), a choice must be make to obtainḠ(τ, k). Here we have normalized both the LO and the NLO+LPM correlators by the O(g 6 ln g) susceptibility [52]. In Ref. [52], the coefficient of one of the undetermined O(g 6 ) term was estimated to be C(N f = 2) ≈ −45, which results in χ pert /T 2 ≈ 0.83, the value which we used for Fig. 2. This value, however, is 6% lower than the continuum extrapolated value we obtained. Based on our lattice continuum result, C(N f = 2) ≈ 33 would result in a better agreement for χ s /T 2 . However, we note that the contributions from successive higher orders in the perturbative calculation of χ pert /T 2 are similar in size, so the estimation for C(N f = 2) is to be treated with caution. As for the strongly coupled, large-N c N = 4 SYM correlator, we note thatḠ(τ, k) is a natural quantity to compare across different thermal systems; in particular, the dependence on the number of colours drops out. In order to compare the correlator G(τ, k) itself, a certain choice must be made for the susceptibility. We have chosen to set χ SYM /T 2 = N 2 c /8 = 9/8.

IV. ANALYSIS OF THE TRANSVERSE-CHANNEL SPECTRAL FUNCTION
Given the transverse-channel Euclidean correlation function in the continuum limit, we proceed to analyze the corresponding spectral function via Eq. (6). Here, we present an analysis based directly on G T /T 3 in Secs. IV A and IV B, since we found the correlator to be statistically particularly precise. In addition, we have seen that the NLO+LPM correlator is only a few percent off the lattice correlator, which encourages us to use that prediction as a baseline in Sec. IV B. In contrast, since the primary continuum-extrapolated quantity was s G = G T /G 00 , Ref. [53] contains an analysis based on that ratio.
We start in Sec. IV A by using the Backus-Gilbert method in order to perform the comparison between lattice data and theory predictions in frequency space without introducing any model dependence. Sec. IV B then presents fits to the lattice data in order to determine the spectral function, with a particular focus on lightlike kinematics, ω = k. For simplicity of notation, we omit the spatial momentum from the arguments in the following sections.

A. Backus-Gilbert method: smeared spectral functions vs. theory predictions
The Backus-Gilbert method is a model-independent approach to overcome the spectral reconstruction problem [25]. By using this method one can determine a local average of the spectral function around a given value of the frequency. In the present case it is also favourable to introduce a rescaling function, f (ω), which in particular removes the singularity at vanishing frequency of the kernel K(ω, τ ). Thus the new kernel is defined as with the rescaling function f (ω) being specified later, but satisfying f (ω) ∝ ω as ω → 0. The smeared, rescaled spectral function,ρ(ω)/f (ω), is defined in this case aŝ where ∆(ω, ω ) is the so-called resolution function or averaging kernel, which is completely specified in terms of some coefficients, g i (ω), and is given as Inserting back ∆(ω, ω ) of Eq. (25) into Eq. (24), we find that the filtered spectral function is the linear combination of the Euclidean correlator data, The coefficients, g i (ω), are determined in the Backus-Gilbert method by minimizing the second moment of the squared resolution function This minimization ensures that the width of resolution function is as small as possible and that, at fixed ω, ∆(ω, ω ) has unit area. The minimizing solution is given as where The matrix A is ill-conditioned in practice, therefore an error functional is added to A[g] which serves as a regulator. As a consequence, A has to be replaced by A reg in Eq. (29), and 0 ≤ λ ≤ 1 is the regularization parameter compromising between stability and resolution. The smaller the value of λ, the larger the regularization. In Eq. (31), Cov[G T ] stands for the covariance matrix of the Euclidean correlator. We note that in our numerical implementation we worked in the units of temperature. We emphasize again that from the filtered spectral function one cannot modelindependently determine the value of ρ(ω) itself, but it could be useful to compare it to a similarly smoothened spectral function coming from other approaches. Once the g i (ω) coefficients are determined via Eq. (29) by utilizing the covariance matrix of the data, the same resolution function can be used to build, for instance, the perturbative filtered spectral function. In Fig. 3, we compare the filtered spectral function obtained from the continuum .
As Fig. 3 shows, the filtered, rescaled spectral function obtained using the lattice data is somewhat below the perturbative one, especially for small momenta (k/T ≤ 3.14). At small frequencies, below ω/T ∼ 5, the filtered spectral function is between the weak-coupling and the N = 4 SYM theory filtered spectral function. Using all nine data points for the correlator, we found that the condition number of A reg gets more or less tractable (smaller than ∼ 10 7 ) when λ is around 10 −4 or smaller. The elements of A reg are still dominated in this case by the first term of Eq. (31), which are several orders of magnitude larger than the elements of the covariance matrix. When λ 10 −2 , the condition number is above 5 × 10 8 andρ(ω) tends to have unphysical wiggles. By omitting e.g. the first data point from the correlator, one can use larger values of λ, but one less coefficient. The results, nevertheless, do not change significantly.
The resolution functions, ∆(ω, ω ), are quite wide and they blur the fine details of the spectral function. They are almost identical for different momenta when fixing the value of λ. Their dependence on λ is also very mild, but increases when going to a higher target ω. Some examples can be seen in Fig. 4. We obtained similar results to those discussed above using the covariance matrix of G T /G 00 instead.
In summary, we have found that the filtered spectral functions derived from the continuum-extrapolated lattice data are close to the NLO+LPM spectral function filtered with the same resolution function, and that the agreement increases both with increasing k and increasing ω, up to ω ≈ 10T . Since beyond that point the resolution function extends to very high frequencies, residual cutoff effects in the lattice results could explain the differences observed with the NLO+LPM curves in that regime. At k = πT , the non-perturbative filtered spectral function undershoots the perturbative prediction, a point already noted in [23] for the 2ρ T + ρ L channel relevant for the dilepton rate. Finally, we remark that the value of [ρ(ω, k)/f (ω)] ω=k found here is very much in the ballpark of the values obtained for [ρ(ω, k)/f (ω)] ω=k with various fit ansätze for the spectral function (see Fig. 5 below), even though the resolution function is rather broad compared to k.

B. The transverse-channel spectral function from fits to the Euclidean correlators
We now turn to turn to our direct attempt at determining the spectral function by fitting various ansätze to the transverse correlator. When specifying possible ansätze describing the transverse spectral function, we have taken into account the facts that it has to be odd in ω, i.e. ρ(−ω) = −ρ(ω), as well as positive for ω > 0, as shown in Ref. [24]. While the ansätze were chosen to be odd by construction, models were only excluded a posteriori if they violated positivity with a 68% confidence level.
The high-frequency behavior of the spectral function is dictated by perturbation theory, therefore we assumed the following form for the transverse spectral function where ρ pert is the NLO prediction for the transverse spectral function complemented with the LPM contribution near the light-cone, and is a smooth step function which controls how fast the perturbative contribution falls off around ω 0 as ω is lowered. Using the decomposition of Eq. (33), we ensure that the perturbative part gives the dominant contribution above the chosen value of ω 0 , which we call the matching frequency. Moreover, the transition from the perturbative regime -assumed to be valid in the ultraviolet -can be realized in a smooth way without constraining any of the coefficients of the fit function. For the fit function, ρ fit (ω), we considered the following two possible ansätze: where N p denotes the number of fit parameters, and ρ fit,2 (ω) where the free parameters have been chosen to be A 0 , B 0 , B 1 , and A 1 has been fixed imposing continuity, In the latter case, we also carried out fits by setting B 0 to zero, i.e. having only two fit parameters. The spectral function of Eq. 33 including ρ fit,1 (ω) or ρ fit,2 (ω) is referred to in the following as the polynomial or the piecewise polynomial ansatz, respectively.
These ansätze are motivated by their ability to describe the spectral function of the strongly coupled N = 4 SYM theory as well as that of the NLO+LPM resummed perturbation theory, to a satisfactory level. The functional form of Eq. (36) is more suitable for the spectral function obtained with the AdS/CFT approach in the N = 4 SYM theory, whereas the built-in non-differentiability at the light-cone in the piecewise polynomial ansatz of Eq. (37) is more apt at describing the cusp present in the NLO+LPM result. More details can be found about the expressivity of these ansätze in Appendix B, in which we present the outcome of mock data analyses and to which we return in the next subsection.
After inserting ρ(ω) from Eq. (33) into Eq. (6), we solved the correlated χ 2 -minimization problem to determine the unknown coefficients. We show some representative fit results with good χ 2 values in Fig. 5 for the momentum k = πT . We explored many variations in the fit procedure and took into account the statistical and systematic error of the continuumextrapolated Euclidean correlators. More details on this and the method of estimating systematic errors on the effective diffusion coefficient, D eff (k), can be found in Appendix A. Our result for D eff (k) extracted using the polynomial ansatz is displayed on the right panel of Fig. 6. The line within the band represents the median of the distribution of results obtained, while the width of the band indicates the position of the 16th and 84th percentile.
The piecewise polynomial fit ansatz turned out to yield a sizeable spread of results for the effective diffusion coefficient. This spread comes from the results for D eff actually falling into two well separated intervals. Fit results with N p = 2 tend to lead to results in the lower interval, while the results obtained with N p = 3 populate both intervals. To illustrate the point, on the left panel of Fig. 6 we display the results for the effective diffusion coefficient D eff (k) obtained with N p = 2 and N p = 3 separately as two coloured bands. This doublypeaked distribution of results for D eff is associated with the behaviour of the spectral function around lightlike kinematics. When using this ansatz to fit the correlator, we obtained spectral functions possessing either a minimum or a spike-like maximum at the light-cone frequency. Representative fit results are shown on the left panel of Fig. 5. Since neither the NLO+LPM weak-coupling nor the SYM spectral functions have a maximum at ω = k, we also investigated the fit results obtained by excluding the fits satisfying at least at one standard deviation. With this qualitative theoretical prejudice in place, the fit results are significantly more constraining. The D eff values obtained this way are displayed  36)) are displayed in red. Analytical results from perturbative QCD [11] using α s 0.25 or 0.31 (dashed lines) and from the strongly coupled N = 4 SYM theory (grey line) [40] are also included as well as an upper bound obtained from the analysis of the T − L channel (dotted line) [24]. We use the value χ s /T 2 = 0.88(2) to obtain T D eff from the lattice data, but the free susceptibility to obtain T D eff from the N f = 2 weak-coupling photon rate. on the right panel of Fig. 6, where they are denoted as "no-peak" solutions. Excluding fits possessing the feature Eq. (38) is also affirmed by the upper limit of the results obtained by analysing the spectral function in the T−L channel [24,29]. As can be seen by comparing the two panels of Fig. 6, a large portion of the solutions with a peak at lightlike kinematics can be excluded by our previous analysis of the T − L channel. We note that the latter analysis has been carried out using the same ensembles that we employ in the present study. For comparison, Fig. 6 also displays the weak-coupling results (dashed lines) obtained directly for the photon rate in Ref. [11], as well as the strongly coupled SYM theory results (solid grey line).

C. Final result for the photon emissivity extracted from the fits
In order to arrive at our final estimate for the photon emissivity, we need to judge the reliability of our fit ansätze in extracting the quantity D eff . For this we return to the analysis of mock data, presented in detail in Appendix B. There, we found that the fit ansatz functions tend to overestimate the photon rate in the investigated models, irrespective of whether G T /G 00 or G T /T 3 is used. We therefore first derive an upper bound for D eff , and hence for the photon emissivity. By fitting a linear ansatz a×k/T +b to T D eff obtained from the polynomial ansatz in the k/T range [π/2, 2π] and then plugging the effective diffusion coefficient, T D eff , into the formula giving the thermal photon emission rate per unit volume, Eq. (2), we arrive at the following upper bound, In Eq. (39), C em denotes the charge factor equal to 2/3 in the N f = 3 theory. We recall that the static susceptibility has been determined in Sec. III C, χ s /T 2 = 0.88(2) at T 250 MeV.
As far as a lower bound on D eff is concerned, we distinguish two regimes, k > πT and k ≤ πT . In the former case, using the NLO+LPM correlator as input, the central value of the output for the effective diffusion constant is usually at least 50 − 80% larger than the true value, and our error estimate typically does not cover the true value in the range 3.5 ≤ k/T ≤ 5.5 (see the right panel of Fig. 9). In the case of the N = 4 SYM theory mock data, the results from the two ansätze bracket the true result for intermediate and smaller momenta k/T 4.5, but our ansätze overestimate T D eff by 5 − 25% at larger momenta. Therefore, for k > πT we extend the lower bound down to the weak-coupling prediction in Fig. 6 (right panel), which we parametrize in Eq. (40).
For the mock data tests performed in the momentum range k ≤ πT on the other hand, the resulting error band always covers the true value for at least one of the two ansätze that we use. As the right panel of Fig. 6 shows, the lower bound is driven by the results of the piecewise polynomial ansatz, which decreases until reaching vanishing D eff values at our smallest momentum. A parametrization of the lower bound can be given by a linear function in k/T that connects zero at k = πT /2 and the NLO weak-coupling result at k = πT . Thus we parametrize our lower bound according to From the observations made above, it is clear that this lower bound is influenced by the theory prejudice that the spectral function at lightlike kinematics does not 'dive' down to even smaller values than the weak-coupling results [11] predict for realistic values of α s ; see We remark that the curves and bands displayed in Fig. 6 have a precise statistical meaning based on percentiles (16 th , 50 th , 84 th ) of the distribution of results obtained by applying a set of procedural variations. The band defined by the k-dependent bounds of Eqs. (39,40) summarizes the results, giving equal weight to both fit ansätze that we have used; it is displayed in Fig. 7.

V. CONCLUSION
In this study, we have presented the first investigation of the transverse channel Euclidean correlator at finite momenta, using N f = 2 O(a)-improved dynamical Wilson fermions at around T 250 MeV corresponding to 1.2T c . We carried out a simultaneous continuum extrapolation using three discretizations of the correlators of the isovector vector currents. We used four ensembles with lattice spacings in the range a 0.033 -0.066 fm. We compared the filtered spectral functions obtained from the continuum extrapolated transverse correlator via the Backus-Gilbert method to the spectral function of perturbation theory at next-to-leading order and the one obtained in the strongly coupled N = 4 supersymmetric Yang-Mills theory. The small-frequency lattice results lie between the filtered spectral functions obtained in these two theories. In order to determine the thermal photon emission rate, we fitted the correlators using polynomial and piecewise polynomial fit ansätze for the underlying spectral function. We validated the expressivity of these ansätze by performing mock tests using the spectral functions obtained in perturbation theory as well as in the N = 4 super Yang-Mills theory. We compared our results for the effective diffusion coefficient to the results obtained in these theories as shown in Fig. 6, and also to the results obtained by analysing the difference of the transverse and the longitudinal channels, presented earlier in Ref. [24]. The ranges obtained are compatible with the weak-coupling as well as with the AdS/CFT results. Our final result for the photon emissivity is displayed in Fig. 7 as a hatched band, for which we provide a parametrization in Eqs. (39,40). For momenta below 1 GeV, the upper edge of the band is more constraining than the estimate derived from the analysis of the difference of the transverse and longitudinal channels. We obtain our strongest constraint on the photon emissivity around k = πT 0.8 GeV, In the future, we plan to (linearly) combine the present analysis with our previous one [24] to provide estimates of the dilepton rate at invariant masses M 2 + − = ω 2 − k 2 > 0. We are also investigating a qualitatively different approach to the photon rate by computing zerovirtuality correlators on the lattice [54,55]. Whilst numerically challenging, this approach allows one to avoid confronting the inverse problem.
German Research Foundation) through the Cluster of Excellence "Precision Physics, Fundamental Interactions and Structure of Matter" (PRISMA+ EXC 2118/1) funded by the DFG within the German Excellence strategy (Project ID 39083149). M.C. was supported by the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 843134-multiQCD. T.H. is supported by UK STFC CG ST/P000630/1. The generation of gauge configurations as well as the computation of correlators was performed on the Clover and Himster2 platforms at Helmholtz-Institut Mainz and on Mogon II at Johannes Gutenberg University Mainz. We have also benefitted from computing resources at Forschungszentrum Jülich allocated under NIC project HMZ21. We made use of the following libraries: GNU Scientific Library [47], GNU MPFR [56] and GNU MP [57].

Appendix A: Technical aspects of the fits for the spectral function
In this appendix, we provide some details on the procedure we followed in Sec. IV B to obtain spectral functions via fits to the Euclidean correlators. After inserting ρ(ω) from Eq. (33) into Eq. (6), we solved the correlated χ 2 -minimization problem to determine the unknown coefficients. When doing so, we regularized the covariance matrix by multiplying the off-diagonal elements by 0.95. To estimate the statistical error, the minimization problem has been solved for every jackknife sample of the continuum correlator by using the covariance matrix obtained from the full data. To quantify the systematic uncertainty of the reconstruction coming from the systematic uncertainty of the continuum limit correlator values, we used the continuum extrapolation fit results for the 16th, 50th and 84th percentile of the AIC-weighted histogram obtained for each τ T value. However, we did not take into account all possible combinations of these histogram representatives, but chose random subsets containing 20 -100 combinations for each momentum.
In order to estimate the systematic error from making parameter choices for the perturbative results or changing various parameters when fitting, we performed several fits using a set of plausible variants. Regarding the perturbative input for our analysis, the NLO+LPM result depends on the coupling constant. Using the four-loop formulae of Ref. [58], we determined the coupling constant by setting the renormalization scale either to µ = 2πT or to µ = 3πT . The coupling constants corresponding to these choices are α s 0.31 and α s 0.25, respectively. We took Λ from the FLAG report [59]. We applied two values for the matching frequency, ω 0 = 10T and ω 0 = 12T , which correspond to 2.5 GeV and 3 GeV, respectively. For the parameter ∆ in Θ(ω, ω 0 , ∆), which governs the falloff of the NLO+LPM contribution towards the infrared, we set ∆ = 2T ,  allowing also for a 20% variation. As a further systematic variation, we adjusted the fit ranges in τ T to include more or fewer data points in the fits. The variations we applied for the systematic error estimation are summarized in Table I. Due to these variations, we collected 750-2000 fits with p-values larger than 0.05 for each momentum when using the polynomial ansatz, and around 15-30% of these fits had p-values larger than 0.5. When using the piecewise polynomial ansatz, we also obtained a lot of fits with good χ 2 (and p-values). The fraction of the number of fits with p-values greater than 0.5 was around 15-30% in that case as well. We built an AIC-weighted histogram from the fit results that we used to estimate the systematic errors [48,49].  we used r(τ )/s in step 2, and instead of a rescaling, we determined the jth jackknife mock correlator value asḠ (j) mock (τ ) =Ḡ mock (τ ) + r(τ )Ḡ (j) (τ ) −Ḡ mock (τ ) /s in step 5. Since we extrapolated s G = G T /G 00 to the continuum (see Sec. III C), we also carried out the mock analysis using this observable. When doing so, G, G mock and G model should be replaced by s G, s G mock and s G model , respectively, in the above procedure, and we refer to this ratio of the chosen model when we have e.g. s G model . Since the covariance matrices are quite different for G T /T 3 and for G T /G 00 , the mock analyses starting from the former or the latter result in different outcomes.
When using the NLO weak-coupling mock correlator as an input for the reconstruction via fitting, we applied some variations in the fit setup, similar to the case of the analysis of the continuum extrapolated correlator (Sec. IV B). These include using either two or three fit parameters in the fit ansätze, changing the number of the correlator data points utilized in the fit, and changing the features (ω 0 /T, ∆/T ) of the matching to the UV behavior. Since when producing the mock correlator, we generated multivariate Gaussian variables randomly, to eliminate a possible effect coming from the random input, we used six different mock correlators generated with the procedure discussed above. The systematic error have been estimated using the AIC-weighted histogram of the various fit results, and we used the mock jackknife samples to estimate the statistical error of the mock analyses. The same fit ansatz functions -Eq. (36) and Eq. (37) -have been applied as in the main analysis.
We found that both fit ansätze perform reasonably well in the weak-coupling case when using the G T /G 00 data, see Fig. 8. Namely, above k/T ≈ π, both ansätze reproduce the input spectral functions within errors in a wide range of ω/T values also at small and large frequencies. The errors, however, are typically larger in this case for the polynomial ansatz (∼ 10 -50%, but even could be 100% at certain frequencies), than when using the G T /T 3 data (errors < 2 -8%). At ω = k, however, the polynomial ansatz gives a much larger value of ρ, i.e. a larger photon rate, using either G T /G 00 or G T /T 3 . The piecewise polynomial ansatz could reproduce the features of the NLO+LPM mock data better, because it is capable of producing a sharper dip at the light-cone. With the help this ansatz, we get smaller photon rates at small momenta (below k/T ≈ π), and larger photon rates at larger momenta, but in this latter case with errors spreading towards smaller values, typically covering the true spectral function at the light-cone. Using the G T /T 3 mock data, this ansatz -similarly to the polynomial ansatz -also returns a larger photon rate at all momenta, see e.g. Fig. 9. In this case, the error covers the bottom of the dip only at and below k/T = π.
The reproduction of the mock spectral function can be improved by a certain amount by reducing the errors on the input mock correlators. As Fig. 12 shows, accomodating an error reduction factor of s = 10, the piecewise polynomial ansatz could mimic the dip at ω = k at a more satisfactory level. For comparison, see the right panel of Fig. 8 and also that of Fig. 9. This observation shows that this particular ansatz having only a few parameters has satisfactory expressiveness of reproducing a model spectral function relevant to the physics discussed in this paper. Conversely, it also indicates that the fact that we obtained less faithful spectral function outcomes when not reducing the error is mainly due to the covariance matrix and not due to a wrong choice of ansätze.
Performing mock tests using the strongly coupled N = 4 SYM theory transverse correlators, the polynomial ansatz adequately reproduces the spectral function even for small momenta with errors less than around 5 -10% for frequencies around and above ω = k, see Fig. 10 and Fig. 11. As one goes for higher momenta, the reproduction gets a bit worse, but usually with errors included, one reaches the true values of the input spectral function. The   photon rates given by this ansatz are always higher than the true photon rate, either using the mock G T /G 00 or the G T /T 3 data, see Fig. 10 and Fig. 11, respectively. The deviation increases from about 2% corresponding to smaller momenta up to 25% for high momenta.
For this model, the piecewise polynomial ansatz could also give reasonably good results for high momenta, where it almost coincides with the polynomial ansatz results. For momenta, k/T < 4.44, it always gives a smaller photon rate, not depending on using G T /G 00 or G T /T 3 , but in the latter case the central value is much closer to the true value.
The mock data also served as a numerical crosscheck for the Backus-Gilbert method, both using G T /G 00 as well as G T /T 3 . As Fig. 13 shows, the Backus-Gilbert method is capable of reproducing the smeared, rescaled spectral function up to around ω/T ∼ 16. , as an input. The results obtained with the help of the 6 randomized mock correlators are shown in orange without including the statistical errors. These errors are comparable to the errors of the reconstructedρ using the exact model data.