Application of radial basis functions neutral networks in spectral functions

The reconstruction of spectral function from correlation function in Euclidean space is a challenging task. In this paper, we employ the Machine Learning techniques in terms of the radial basis functions networks to reconstruct the spectral function from a finite number of correlation data. To test our method, we first generate one type of correlation data using a mock spectral function by mixing several Breit-Wigner propagators. We found that compared with other traditional methods, TSVD, Tikhonov, and MEM, our approach gives a continuous and unified reconstruction for both positive definite and negative spectral function, which is especially useful for studying the QCD phase transition. Moreover, our approach has considerably better performance in the low frequency region. This has advantages for the extraction of transport coefficients which are related to the zero frequency limit of the spectral function. With the mock data generated through a model spectral function of stress energy tensor, we find our method gives a precise and stable extraction of the transport coefficients.

In the hadronic and QGP phases, the spectral functions of two point correlation function behave quantitatively different since confinement prevents the two point correlation function of quark or gluon from appearing as the mass pole in real time axis [36][37][38][39][40][41][42]. As for the extraction of transport coefficients, the spectral functions of the stress energy tensor correlation function are required in the description of the Kubo formula. These transport coefficients can be expressed as the low-frequency part of the spectral functions [43][44][45][46][47]. However, the required real-time information, ie. spectral function, is very difficult to be obtained directly through the firstprinciple calculation, since most of the non perturbative QCD computations including lattice simulation and functional QCD methods are carried out in Euclidean space [36]. It is known that the SPF is related to the correlation function by an integral manner with respect to the full momentum spaces. With the limited inputs from Euclidean space, it then becomes an illposed problem where the solution of inverse mapping suffers several undetermined properties such as the existence, uniqueness and stability. The limited numerical data of correlation functions drive to large uncertainties in extracting SPF, i.e., it produces a family of allowed solutions, but most of them are highly oscillated and not physically meaningful [48,49].
In order to obtain the physical solution, many methods have been suggested. For example, a truncated Singular Value Decomposition (TSVD) [48,50,51] is a straightforward method that has been widely employed to analyze the ill-conditioned inverse problem. By treating the high frequency part as noise and then truncating it in SVD method, the approximate solution converges to the exact solution. The regulating term that makes the spectral function less oscillating is another choice as in Tikhonov method [52,53]. The most commonly used and powerful reconstructing approach of the spectral function is the maximum entropy method (MEM) [36,47,[54][55][56], where the uniqueness of the extracted spectral representation can be achieved through introducing a default model for prior conditions.
Recently, deep neural networks have been employed to solve the problem of SPFs in supervised learning [57][58][59]. The previous computations applied a direct mapping from correlations to the SPFs with a long time training on the prepared datasets. The way of supervised learning could quickly get highly accurate predictions as long as the test samples belong to the same distribution of the training set. However, it might break down without warning if the test samples are from different domains' datasets. Stimulating by the recent development of neural networks, we adopted the radial basis functions network (RBFN) [60,61] which is a multilayered perceptron model that is widely used in classification, regression, feature extraction, etc [62][63][64][65]. The main strategy of this approach is to transform the inverse mapping problem into calculating the linear weights of the radial basis functions (RBF), which enables a smooth and continuous reconstruction which hasn't been accomplished by other methods. Besides, the programme runs fast with the adopted matrix method used in this paper.
This paper is organized as the following. Sec. II introduces the notorious problem of spectral reconstruction and briefly arXiv:2106.08168v1 [hep-ph] 15 Jun 2021 reviews several state-of-the-art methods widely used to reconstruct the spectral function. Sec. III describes our RBFN method. Section IV shows the numerical results from our RBFN method, together with a comparison to the results from other traditional methods described in Sec. II. Sec. V summaries this paper and discusses possible future works.

II. SPECTRAL RECONSTRUCTION AND EXISTING METHODS
A. Spectral reconstruction as an ill-posed problem The correlation functions in Euclidean space can be calculated via the first-principle non perturbative approaches [44,47,66]. The SPF ρ(ω, T ) is related to the correlation function G(τ, T ) through an integral spectral representation: where the integration kernel is τ is the imaginary time and T stands for temperature. Above equation belongs to the Fredholm integral equation of the first kind, which is ill-posed since there are many solutions to a certain set of input data. The spectral representation has been studied in several cases. Firstly, the spectral function of the propagator reveals the dispersion relation of the respective particle. As known by the Kallen-Lehmann spectral representation, it gives a general expression for the two point Green function of quantum field theory in vacuum, which is written as: with ρ(p 2 ) being the spectral function of the propagator. Similar forms have been generalized into finite temperature and chemical potential regions where the information of the spectral function are related to the chiral and deconfinement phase transitions directly [67][68][69]. Secondly, another important goal is to obtain the transport coefficients such as the shear viscosity η, which could be derived from the spatial traceless part π i j of stress-energymomentum tensor correlation function via Kubo relation [46]: where The above correlation function is defined in Minkowski space and is deformed to spectral representations via analytic continuation where The Matsubara frequency ω n = i(2n + s) πT in imaginary time formula with fermionic (s = 1) or bosonic (s = 0). And its Fourier transformation defined as which is the standard Euclidean quantity as in Eq.(1) applied in this ill-posed problem. The extraction of transport coefficients requires the giving spectral function to be smooth and stable at the low frequency limit, which will then show to be the advantage of the method developed here. As mentioned above, up to now, the lattice simulations and other functional methods are usually used to compute the correlation functions at a finite set of discrete points in the Euclidean space. Following the obtained numerical data, the integral equation is usually discretized as: Without losing generalization, ∆ω can be incorporated into K. In general, the number of the correlation function data points G (τ i , T ) is at order of ∼ O (10) accompanied with inevitable noises, however, the spectral function defined in a domain which requires around ∼ O (10 3 ) data to be well constructed [36]. One of the most intuitive way is to monitor and minimize the difference between the original G(τ i ) and the ob-tainedĜ(τ i ), whereĜ is the correlation function written by the reconstructed SPF. However, uncertainty remains as multisolutions exist. Therefore, additional prior information of the SPF is taken into account by different methods to help reduce the numbers of the solution, briefly explained below.

B. Traditional Methods
In this subsection, we will introduce some commonly used methods of analyzing the ill-posed problem.

Truncated Singular Value Decomposition (TSVD)
The singular value decomposition (SVD) is a tool to decompose the singular Matrix. After introducing a truncation of SVD, the Truncated Singular Value Decomposition (TSVD) is capable of analyzing the ill-posed problem in linear equation subject [48,50,51]. Since we will also implement TSVD in our RBFN method described below, we will review the TSVD method with details.
In TSVD scheme, the integral kernel is decomposed as [50] K where U and V are orthonormal matrices, and S is a diagonal one.
R n×n , the singularly valued matrix S = diag (s 1 , s 2 , ..., s n ) ∈ R m×n with elements ordered as: In most cases, the singular values s i decrease rapidly to zero and the demanding components are reduced as a consequence. The solution of the spectral is taken the form of whereρ is the reconstructed spectral function deriving from the generalized inverse matrix and the reduced vector space {v i }. As s i goes to zero, the weight of the corresponding basis vector grows infinitely, which mainly contributes to the high frequency part. One observes that a tiny error in the coefficients u T i G, would be amplified by the singular value s i in the denominator, known as the common instability character in ill-posed problems. And such noise is inevitable since the correlation function data G comes from the expensive Monte Carlo simulation. Indeed, as an underdetermined system, the high frequency part is actually out of control due to the complexness of its structure. To precisely predict the low-frequency part of the SPF, hence, it usually sacrifices the fine structure in the high-frequency regime but simply keep the stable parts as it has already been applied in the truncated singular value decomposition (TSVD) method. Finally, we rewrite Eq.(10) asρ where the truncating parameter k is chosen through analyzing the ratio between signal and noise. It's known that the oscillations tend to increase as k increases, thus then the regularized solutionρ k behaves smoother than the originalρ as expected.
The convergence of this method has been examined in [50] by studying the relation between β i and s i , where Here the nonnegative real constant α represents the decay rates of β i with respect to s i . For α > 1, β i decays faster than s i , the regularized solutionρ k converges to the true solution and a larger α leads to a better approximation. While taking noise of G into consideration, a faster decay of β i is required to smooth it. This is called the discrete Picard condition (DPC), which is employed as a verification for TSVD and also other similar regularization methods. The TSVD method is easy to understand and implement. However, the truncated parameter k is a rather arbitrary number. Plus, this integer value is hard to deal with since the optimized solution will change from one to another uninterruptedly. Besides, it is technically difficult to introduce the prior physical knowledge such as positivity, commonly believed asymptotic behavior and so on into the solution.
The solution turns out to be: Plugging into the conventional SVD routine, the solution transforms to The regularization parameter λ modifies the converging condition especially in the region of s 2 i λ. Unfortunately, due to the exponential decreasing of singular values in our mocking system, the Tikhonov method does not improve much in convergence compared to TSVD scheme.

Maximum Entropy Method (MEM)
Another popular strategy for SPF reconstruction is Maximum Entropy Method (MEM). It is built on Bayes's theorem and mainly described by two terms. One of them, χ 2 , is parameterized as the Gaussian likelihood distribution based on the central limit theorem. Another one, named as Shannon-Jaynes entropy S , serves as a regulator to restrain the deviation of the reconstructed ρ(ω) against the default model of m(ω). The Bayes' theorem is in the form of [36,[54][55][56], where D stands for the data points of correlation functions and I represents the prior knowledge about the SPF. The likelihood function reads HereD i is the spectral representation using Eq.(1), σ 2 i is the covariant matrix with all off-diagonal elements zero. This part is nothing but the standard χ 2 fitting, describing the errors between the reconstructed spectral function and the original data points. Only with this part the solutions cannot be settled, and hence, the entropy S is plugged in the prior probability P (ρ| I) to encode the knowledge of SPF, which reads as [36], Again, m(ω) is the smooth ansatz of SPF, and λ is a real and positive parameter. The entropy term exponentially increases while the spectral function deviates from the default model, and then, the oscillated solution would be diminished rapidly. It is proved that the solution is becoming unique after introducing entropy S .
We emphasize here that the traditional methods often assume the solution is the minimally-possible-oscillated function. This smoothness assumption serves as an additional constraint in constructing SPF, adopted in many methods [36,50,52]. To check the robustness and plausibility of such prior estimation in the traditional methods, on the contrary, we here apply a new approach free of this assumption based on the radial basis function networks.

III. RADIAL BASIS FUNCTION NETWORKS
In this section, we will explain the constructed Radial Basis Function Networks, which are used to extract the spectral functions from the correlation data in this paper. Radial Basis Function refers to a function taking form of φ(||x − m||), where || · || is the Euclidean norm and measures the distance of input x and some fixed point m. Radial basis function network is a simple neural network that uses RBFs as activation functions, where the output is a weighted linear combination of RBFs of the inputs. They have been widely applied to interpolate scattered data and approximate multivariate functions [60,[70][71][72], solving equations [73][74][75], etc.
In principle, an arbitrary function ρ (ω) (including the spectral function) can be approximately described by a linear combination of radial basis functions: where φ are the active RBFs, w j is the weight to be determined by optimization, while the centers of a radial basis function m j should be determined artificially before optimization. A number of functions can be applied as the RBF, such as radial distance, thin-plate spline, compact support, etc. In this paper, we employ two types of RBFs, Gaussian and MQ, respectively. They are written as: where the adjustable value of a in Gaussian or MQ is known as the shape parameter, which is essential for the regularization. In general, the shape parameter a is associated with the distance between adjacent centers and controls the smoothness of the interpolating function. With SPFs described as a combination of certain Gaussian or MQ basis functions, the related solution can be more easily obtained. Note that the basis could be altered to some other functions to suit the specific requirement of certain problems.
Radial basis function networks is a three layers feedforward neural network with the active RBFs built in the hidden layer (as illustrated by Fig.(1)). In more detail, the RBFN transform Eq.(20) into its matrix form: whereK is a N × M matrix, which is irreversible. In the following calculation in Sec.IV, we generate N = 30 data points for the correlation function G i , using some specific mock SPF.
To obtain w j , we further implement the TSVD method 2 described above: whereũ i ,s i ,ṽ i are decomposed according to Eq.(9) fromK ik defined in Eq. (23). The summation cutoff k is set by hand and equal to 10 in the following calculation, which will be explained in the appendix. Compared with the traditional Tikhonov method and Maximum Entropy Method mentioned above, which treat the SPF as discrete data points, our RBFN scheme is genuinely meshless and mathematically straightforward. Besides, compared with other neural networks methods, such as supervised learning approach [57][58][59], it is rapidly trained and free from the bias and overfitting problem. 1 We found that the matrix with M = N = 500 is sufficient to construct the several spectral function presented in this paper. Larger matrix does not improve the results. 2 w i can also be obtained by machine learning using the gradient descent algorithm. While for the simple network structure shown by Fig.(1), it costs more calculation time and the results may not be stable, compared with the TSVD method used here.

IV. RESULTS
In this section, we will implement RBFN method for the correlators generated by two types of mock SPF, associated with propagators [76][77][78] and stress energy tensor [47], respectively. The first type can be applied to describe the quasi-particles, which are especially useful for studying the phase transition from hadron phase to QGP phase. The second type of SPF contains a resonance peak at low frequency and a continuum part at large frequency, which can be used to extract the transport coefficients, such as the diffusion coefficient of heavy quarks.
We firstly construct the propagators by mixing several Breit-Wigner distributions: with where A i is the normalization parameter, M i denotes the mass of the particle, carrying the location of the peak, and Γ i is the width. For the calculations in Fig. (2) and Fig. (3), the mock SPFs are constructed through combination of two Breit-Wigner peaks using A 1 = 0.8, M 1 = 2, Γ 1 = 0.5; A 2 = 1, M 2 = 5, Γ 2 = 0.5 (para-I). For the calculations in Fig. (4), the parameters are set to A 1 = −0.3, M 1 = 2, Γ 1 = 0.5; A 2 = 1, M 2 = 3.5, Γ 2 = 0.5 (para-II), which make the first peak of the mock SPF become negative. With the constructed mock SPF, we then generate the Euclidean correlation functions G(τ i ) according to Eq.(1) (T = 0), together with noise added to the mock data: Here we generate 30 discrete correlation data evenly distributed between τ = 0 and 10. The noises are randomly generated by the Gaussian distribution similar to [36] where the mean are set to 0 and the width is a adjusted parameter that corresponds to different noise levels. Fig.(2) compares the predicted spectral functions from RBFN, TSVD, Tikhonov and MEM using the correlation data described above. For each prediction, the solution is obtained by averaging 10 results using the sampled correlation data with random Gaussian noises. The shaded area denotes the uncertainty for each model prediction. Although the Tikhonov method shows a tiny improvement with the infrared parameter λ, both Tikhonov and TSVD share the same oscillation behavior at the low frequency part. In contrast, RBFN and MEM provide better predictions at the low frequency part. The results from RBFN do not oscillate and almost reproduce the first peak of the mock SPF for the correlation data with smaller noise = 0.00001. This is especially important for such a task of extracting the transport coefficients from the Kubo relation described by Eq.(4). Our RBFN is the only method that reduces the oscillation in the low frequency area, compared with the other three commonly used approaches.
At high frequencies, all four studies fail to reconstruct the second peak of the mock SPF from the discrete Euclidean correlators with larger gaussian noises = 0.001. After reducing the gaussian noises to = 0.00001, both RBFN and MEM can approximately reproduce the second peak of the mock SPF. In contrast, Tikhonov and TSVD methods fail to do so. Although the behavior of RBFN results at high frequency is somewhat spoiled, it might be improved by imposing the positivity condition, which we would like to leave to further investigation.
In Fig.(3), we compare the predicted spectral functions from our RBFN method using two types of RBFs called Gaussian and MQ described by Eq. (21). The correlation data are generated with the same mock SPF as used in Fig.(2), using different levels of Gaussian noises. These two types of RBFN predict similar results, while the one with Gaussian RBF is more powerful to reconstruct the locating of the peaks for the spectral function. It also almost describes the width of the first peak for the mock SPF.
In Fig.(4), we test our RBFN (with the Gaussian RBF), using the correlation data generated from the two-peak mock SPF with the first peak turning negative (para-II as described above). It turns out that our RBFN works consistently well in constructing such spectral functions at the low frequency limit. The location of the peak has been well represented within a range of allowable errors, especially for the correlation data with smaller Gaussian noises. Again, the behaviors of the constructed spectral function are somewhat spoiled at high frequency regime for the correlation data with larger noises, which are similar to the cases in Fig.(2) and Fig.(3). Now we test our RBFN using another type of spectral function associated with the heavy quark diffusion in the QGP [47]: where χ 00 is the quark number susceptibility, η D = T M 0 D is the drag coefficient and M 0 is the thermal quark mass. Fol- lowing [47], we set M 0 to be 1 GeV and η D ≈ 0.9T 2 M 0 . The transport peak in the infrared part determines the transport coefficients, while the continuous part in the perturbative region can be calculated perturbatively. Therefore, we use the analytical form above 1 GeV and apply our RBFN method with Gaussian RBF to determine the spectral function below.
Using Eq.(23) and the above spectral function, again, we generate 30 discrete correlation data that are evenly distributed between τ = 0 and 10. To testify the stability, we also put small random truncated Gaussian noises with width 1 and truncation between −5 MeV and 5 MeV to M 0 to mimic the possible deviation between the numerical data and the analytical form of the continuous part.
The upper panel of Fig.(5) shows our RBFN's predictions for the low-frequency parts of the spectral function. In general, our method gives a stable and smooth spectral in the infrared regime that nicely fits the ground truth described by Eq.(28) for various temperatures. Due to numerical instability closing zero, we have applied a cut-off for ω at ∼ 10 −2 , which corresponds to ω/T ∼ 10 −1 in the figure. We have added noises to M 0 in the first place, and that is where the uncertainties mainly come from. The quark diffusion coef-ficient D can be obtained from the diffusive Kubo formula D = 1 6χ 00 lim ω→0 ω . As shown by the lower panel of Fig.(5), our prediction is consistent with the analytical result obtained from Eq.(28).

V. SUMMARY
A direct computation of spectral function is extremely difficult in QCD because of confinement. The reconstruction from numerical data computed in Euclidean space is an alternative way. As a typical ill-posed problem, several algorithms have been applied in reconstructing such as TSVD, Tikhonov method, and MEM. In this paper, we developed a machine learning technique for the extracting process. Unlike other commonly used discretized schemes, a continuous neural networks representation based on the radial basis functions has been adopted.
We first applied it for building the spectral of propagators. Our method shows a consistent reconstruction for different types of propagators which are parametrized with several Breit-Wigner distributions. In detail, the location and width of peak for the propagator with positive definite and negative spectral can be both well reproduced within allowable errors. This is especially useful considering the confinementdeconfinement phase transition, since it has been widely conjectured that the spectral of confined states have a negative part in contrast to that of the deconfined states which are positively described.
We also compare two types of RBF in our analysis, Gaussian and MQ ones. It is found that both obtain the SPF efficiently, but Gaussian RBF leads to a better fitting for the location, as well as the width of the peak. Therefore, the Gaussian RBF generally better suits the problem considered here, the reconstruction of SPF.
Moreover, checked by a known formulation of the spectral of energy stress tensor, our approach shows a significant performance in the low frequency region and presents a stable result for the diffusion coefficients comparing to other traditional ways. This benefit is with great importance for the future studies of transport coefficients. Its applications in the low energy regime, combining with the non-perturbative QCD calculations, will be explored elsewhere.
VI. APPENDIX   Fig.(6) and Fig.(7) show the basis vectors v i for different shape parameters a in Gaussian RBFN and MQ RBFN, obtained by the TSVD method in Eq. (24). As the index i increases, v i strongly oscillates with ω. For MQs RBFN, a modifies the magnitude of v i , but does not change their shape. For Gaussian RBFN, a affects both the shape and magnitude of v i . In the calculations shown in Sec.IV, a is set to be around 0.4, which improves the positivity of the predicted SPF. Without such a physical constraint, the spectral function can still be extracted but with slightly larger errors. Except the shape parameter, a truncation of the basis space associated with the index α is also required, similar to the TSVD method. Nevertheless, the subspace can be simultaneously determined in the RBFN method since the effect of noise in the data is presented by α and in turn, this decaying index will help to choose the "good" components according to the discrete Picard condition. We denote the criterion bỹ α i = log (β i /s i ). Ifα i > 0, the respective i-th basis is strongly affected by the noise, while forα i ≤ 0, it indicates the corresponding vectors are stable against the noise. The behavior of {α i } is shown in Fig.(8). It is worth mentioning that the "good" components which selected by the RBF method are not always the minimally-oscillated functions in contrast to the conventional ways. We also expect that the formed truncation scheme, controlling by the noise, will serve as a guideline in the TSVD approach.
We thank M. Asakawa for helpful discussions. The work is supported by the NSFC under grant Nos. 12075007 and 11675004. F. Gao is supported by the Alexander von Humboldt foundation. We also gratefully acknowledge the extensive computing resources provided by the Super-computing Center of Chinese Academy of Science (SCCAS), Tianhe-1A from the National Supercomputing Center in Tianjin, China and the High-performance Computing Platform of Peking University.