QCD Equation of State of Dense Nuclear Matter from a Bayesian Analysis of Heavy-Ion Collision Data

Bayesian methods are used to constrain the density dependence of the QCD Equation of State (EoS) of dense nuclear matter, using data on the mean transverse kinetic energy and the elliptic flow of protons from heavy-ion collisions (HIC) in the beam energy range $\sqrt{s_{\mathrm{NN}}}=2\text{--}10$ GeV. The analysis yields tight constraints on the density-dependent EoS up to four times the nuclear saturation density. The extracted EoS is in good agreement with other observables measured in HIC experiments and with constraints from astrophysical observations, neither of which were used in the inference. The sensitivity of the inference to the choice of observables is also discussed.

The properties of dense and hot nuclear matter, governed by the strong interaction as described by quantum chromodynamics (QCD), are a widely studied but unresolved topic in high-energy nuclear physics. First-principles lattice QCD studies, at vanishing and small baryon chemical potential, predict a smooth crossover transition from a hot gas of hadronic resonances to a chirally restored phase of strongly interacting quarks and gluons [1,2]. However, at high net baryon density, i.e., large chemical potential, direct lattice QCD simulations are at present not available due to the fermionic sign problem [3]. Therefore, QCD-motivated effective models as well as direct experimental evidence are employed to search for structures in the QCD phase diagram, such as a conjectured first- or second-order phase transition and a corresponding critical endpoint [4-6]. Diverse signals have been suggested over the last decades [7-11], but a conclusive picture has not yet emerged, due to the lack of systematic studies relating all possible signals, consistently and quantitatively, to an underlying dynamical description of the system.
Recently, both machine learning and Bayesian inference methods have been employed to address this lack of unbiased quantitative studies. A Bayesian analysis has shown that the hadronic flow data from ultrarelativistic heavy-ion collisions at the LHC and RHIC favor an EoS similar to that calculated from lattice QCD at vanishing baryon density [12]. In the high-density range, where lattice QCD calculations are not available, deep learning models are able to distinguish scenarios with and without a phase transition using the final-state hadron spectra [13-17].
This work presents a Bayesian method to quantitatively constrain the high net baryon density EoS from data of intermediate-beam-energy heavy-ion collisions. A recent study has attempted such an analysis using a coarse, piecewise-constant speed-of-sound parameterization of the high-density EoS [18]. In the present study, a more flexible parameterization of the density dependence of the EoS is used in a model which can incorporate this density-dependent EoS in a consistent way and then make direct predictions for different observables.
In this work, the dynamic evolution of heavy-ion collisions is described entirely by the microscopic Ultrarelativistic Quantum Molecular Dynamics (UrQMD) model [19,20], augmented by a density-dependent EoS. This approach describes the whole system evolution consistently within one model. No parameters besides the EoS itself are varied here.
UrQMD is based on the propagation, binary scattering and decay of hadrons and their resonances. The density-dependent EoS used in this model is realized through an effective density-dependent potential entering the nonrelativistic Quantum Molecular Dynamics (QMD) [7,21,22] equations of motion, $\dot{\mathbf{r}}_i = \partial H/\partial \mathbf{p}_i$ and $\dot{\mathbf{p}}_i = -\partial H/\partial \mathbf{r}_i$.
Here $H = \sum_i H_i$ is the total Hamiltonian of the system, including the kinetic energy and the total potential energy $V = \sum_i V_i \equiv \sum_i V(n_B(\mathbf{r}_i))$. The equations of motion are solved given the potential energy $V$, which is related to the pressure in a straightforward manner [23].
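As a minimal illustration of these equations of motion (a sketch, not the UrQMD implementation), a symplectic velocity-Verlet integrator can be used; the potential gradient `grad_V` below is a hypothetical stand-in for the density-dependent force term:

```python
import numpy as np

def qmd_step(r, p, m, grad_V, dt):
    """One velocity-Verlet step of the QMD equations of motion,
    dr_i/dt = dH/dp_i = p_i/m and dp_i/dt = -dH/dr_i = -dV/dr_i."""
    p_half = p - 0.5 * dt * grad_V(r)          # half kick from the potential
    r_new = r + dt * p_half / m                # full drift from the kinetic term
    p_new = p_half - 0.5 * dt * grad_V(r_new)  # second half kick
    return r_new, p_new
```

Symplectic schemes of this type conserve the total energy to high accuracy over long integrations, which is why they are standard in QMD-type transport codes.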
Here, $P_{\mathrm{id}}(n_B)$ is the pressure of an ideal Fermi gas of baryons, and the interactions enter through the single-particle potential. Evidently, the potential energy is directly related to the EoS, and therefore the terms potential energy and EoS are used interchangeably in this letter.
This model assumes that only baryons are directly affected by the potential interaction [24]. A much more detailed description of the implementation of the density-dependent potential can be found in [23,25]. Note that this method yields, for bulk matter properties, results strikingly similar to those of relativistic hydrodynamic simulations when the same EoS is used [25].
To constrain the EoS from data, a robust and flexible parameterization of the density dependence of the potential energy, capable of constructing physical equations of state (EoSs), is necessary. For densities below twice the nuclear saturation density (n0), the EoS is reasonably constrained by QCD chiral effective field theory (EFT) calculations [26,27], data on the nuclear incompressibility [28], flow measurements at moderate beam energies [7,29-31] and Bayesian analyses of both neutron star observations and low-energy heavy-ion collisions [32]. This work focuses on the high-density EoS, particularly on the range 2n0-6n0, which is not yet well understood. Therefore, the potential energy V(nB) is fixed for densities up to 2n0 by using the Chiral Mean Field (CMF) model, fit to nuclear matter properties and flow data in the low-beam-energy region [23]. For densities above 2n0, the potential energy per baryon V is parameterized by a seventh-degree polynomial in the density, with coefficients θ1,...,θ7 and a constant offset h = -22.07 MeV set to ensure that the potential energy is a continuous function at 2n0. This work constrains the parameters θi, and thus the EoS, via Bayesian inference using the elliptic flow v2 and the mean transverse kinetic energy ⟨mT⟩ − m0 of mid-rapidity protons in Au-Au collisions at beam energies √sNN ≈ 2-10 GeV [40-42]. Important, sensitive observables such as the directed flow [9,43] are then used to cross-check the extracted EoS. Proton observables (as a proxy for baryons) are chosen because the interesting features of the EoS at high baryon density and moderate temperatures are dominated by the interactions between baryons, and protons are the most abundant hadron species actually measured in experiments at the beam energies considered in the present work. Further details on the choice of data and the calculation of the flow observables are given in appendix A, which includes Ref. [44].
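The parameterization can be sketched as follows. The choice of expansion variable x = nB/n0 − 2 is an assumption of this sketch (so that the polynomial part vanishes at 2n0 and the constant h alone fixes the continuity condition quoted above); only the degree, the seven coefficients θi and the value of h are taken from the text:

```python
import numpy as np

H_OFFSET = -22.07  # MeV; value of V at 2 n0 quoted in the text

def potential_per_baryon(n_over_n0, theta):
    """Potential energy per baryon above 2 n0: seventh-degree polynomial
    plus the constant offset h. The expansion variable x = n_B/n0 - 2
    is an assumption of this sketch, chosen so V(2 n0) = h exactly."""
    x = np.asarray(n_over_n0, dtype=float) - 2.0
    V = np.full_like(x, H_OFFSET)
    for i, theta_i in enumerate(theta, start=1):  # theta_1 ... theta_7
        V += theta_i * x**i
    return V
```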
The experimental data D = {v2^exp, ⟨mT⟩^exp − m0} are used to constrain the model parameters θ = {θ1, θ2, ..., θ7} via Bayes' theorem, P(θ|D) ∝ P(D|θ)P(θ). Here P(θ) is the prior distribution, encoding our prior knowledge of the parameters, while P(D|θ) is the likelihood of the data for a given set of parameters, which quantifies how well the parameters describe the observed data. Finally, P(θ|D) is the desired posterior, which codifies the updated knowledge on the parameters θ after encountering the experimental evidence D.
The objective is to construct the joint posterior distribution of the 7 polynomial coefficients θ from the experimental observations, for which Markov Chain Monte Carlo (MCMC) sampling methods are used. For an arbitrary parameter set, the relative posterior probability, up to an unknown normalisation factor, is simply given by the prior probability weighted by the likelihood. To evaluate the likelihood for a parameter set, the v2 and ⟨mT⟩ − m0 observables need to be calculated with UrQMD. The MCMC method then constructs the posterior distribution by exploring the high-dimensional parameter space through numerous such likelihood evaluations. This would require numerous computationally intensive UrQMD simulations and thus infeasible computational resources. Hence, Gaussian Process (GP) models are trained as fast surrogate emulators of the UrQMD model, interpolating the simulation results in the parameter space [12,45-47]. Cuts in rapidity and centrality that match those of the experiments are applied to the UrQMD output to create the training data for the GP models. The constraints applied to generate the physical EoSs used to train the models, the performance of the GP models and other technical details can be found in appendix B.
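A minimal sketch of such a surrogate, using scikit-learn rather than the (unspecified) GP library of the analysis; the training data here are synthetic placeholders for the coefficient sets θ and a single observable, not UrQMD output:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 7))  # placeholder coefficient sets theta
w = rng.normal(size=7)
y_train = X_train @ w                # placeholder observable, e.g. v2 at one energy

# One emulator per observable; an anisotropic RBF kernel with a small
# noise term is a common (here assumed) choice.
gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=np.ones(7)) + WhiteKernel(noise_level=1e-4),
    normalize_y=True,
).fit(X_train, y_train)

mean, std = gp.predict(X_train[:5], return_std=True)  # fast surrogate evaluation
```

The `return_std` output is what allows the emulator uncertainty to be folded into the likelihood, as described below.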
The prior on the parameter sets is chosen as Gaussian distributions with means and variances evaluated under physical constraints. More details on the choice of the priors are given in appendix C. The log-likelihood is evaluated using uncertainties from both the experiment and the GP model. The prior, together with the trained GP emulator, the experimental observations and the likelihood function, is used for the MCMC sampling, employing the DEMetropolisZ [48,49] algorithm from PyMC v4.0 [50].
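The log-likelihood can be sketched as a Gaussian in which the two uncertainty sources are combined; adding the experimental and GP variances in quadrature is an assumption of this sketch:

```python
import numpy as np

def log_likelihood(pred_mean, pred_std, data, data_err):
    """Gaussian log-likelihood for the observables. The emulator (GP)
    uncertainty and the experimental uncertainty are added in
    quadrature (a common choice, assumed here)."""
    pred_mean, pred_std = np.asarray(pred_mean), np.asarray(pred_std)
    data, data_err = np.asarray(data), np.asarray(data_err)
    var = data_err**2 + pred_std**2
    return -0.5 * np.sum((pred_mean - data) ** 2 / var
                         + np.log(2.0 * np.pi * var))
```

A sampler such as PyMC's DEMetropolisZ then only needs this function plus the Gaussian log-prior to explore the seven-dimensional coefficient space.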
Closure tests. In order to verify the performance of the Bayesian inference method described above, two closure tests are performed. The first test involves constructing the posterior using v2 and ⟨mT⟩ − m0 simulated with UrQMD, with the experimental uncertainties, for a specific but randomly chosen EoS. The inference results are then compared to the known 'ground truth'. Figure 1 shows the posterior constructed in one such test for a random input potential. The black curve in the plot is the 'ground-truth' input potential, while the color contours represent the reconstructed probability density for a given value of the potential V(nB). Two specific estimates of the 'ground-truth' potential are highlighted in the figure alongside the posterior distribution of the potential: the Maximum A Posteriori (MAP) estimate, which represents the mode of the posterior distribution as evaluated via MCMC, and the 'MEAN' estimate, calculated by averaging the values of the sampled potentials at the different densities. The comparison of the MAP and MEAN curves with the 'ground truth' shows that the reconstruction results of the Bayesian inference are centered around the 'ground-truth' EoS and that the sampling indeed converges to the true posterior.
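The MEAN curve and the credibility band shown in the figures can be computed directly from the sampled potentials evaluated on a common density grid, e.g. (the MAP estimate additionally requires locating the posterior mode and is not sketched here):

```python
import numpy as np

def mean_and_band(V_samples):
    """MEAN curve and central 68% band from posterior samples of V(n_B).
    V_samples: array of shape (n_samples, n_densities), one row per
    sampled potential evaluated on a fixed density grid."""
    mean = V_samples.mean(axis=0)
    lo, hi = np.percentile(V_samples, [16, 84], axis=0)
    return mean, lo, hi
```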
From the spread of the posterior it can be seen that, for the observables used in the present study, the EoS in the closure test is well constrained up to densities of 4n0. For densities from 4n0 up to 6n0 the generated EoSs have larger uncertainties. However, the mean potential closely follows the true potential.
The second closure test is done to determine the sensitivity of the inference to the choice of observational data. The procedure is similar to the previous test, except that the ⟨mT⟩ − m0 values for √sNN = 3.83 and 4.29 GeV are not used to estimate the posterior. When these two data points are excluded, the agreement of the 'ground-truth' EoS with the MAP and MEAN estimates deteriorates considerably for densities greater than 4n0. This indicates that these data points are indeed crucial for constraining the EoS at higher densities. Further details about these closure tests, and the sensitivity to excluding different data points, can be found in appendices D, E and F. There, a comparison of the prior and posterior probability distributions is also shown, to highlight the actual information gain obtained through the Bayesian inference.
Results based on experimental data: The results of sampling the posteriors using experimental data, for the two cases with and without the ⟨mT⟩ − m0 values at √sNN = 3.83 and 4.29 GeV, are shown in figure 2. The upper panel corresponds to using all 15 experimental data points, while the lower panel shows the results without the two ⟨mT⟩ − m0 values. The data used in this paper do constrain the EoS well for densities from 2n0 to 4n0. However, beyond 4n0 the sampled potentials have a large uncertainty, and the variance is significantly larger for the posterior extracted from 13 data points. Beyond densities of about 3n0, the posterior extracted using 13 data points differs significantly from the posterior extracted using all 15 points. This is quite different from our closure tests, where the extracted MAP and MEAN curves did not depend strongly on the choice of the data points used. This indicates a possible tension within the data in the context of the model used.
To understand this significant deviation, which appears when only two data points are removed, the MAP and MEAN EoSs resulting from the two scenarios are implemented in the UrQMD model to calculate the v2 and ⟨mT⟩ − m0 values, which are then compared with the experimental data used to constrain them. Figure 3 shows the MAP and MEAN curves together with 1-sigma confidence intervals from the posterior. Both results, with their different inputs, fit the v2 data very well, except for a small deviation at the highest energies. The fit is slightly better when the ⟨mT⟩ − m0 values at the lowest energies are removed. At the same time, using all data points results in larger ⟨mT⟩ − m0 values for both the MAP and MEAN curves. The bands for ⟨mT⟩ − m0 are much broader than the bands for v2. Yet, the uncertainty bands clearly support the differences in the fits portrayed by the MEAN and MAP curves. The model encounters a tension between the ⟨mT⟩ − m0 and the v2 data. This tension may either be a true tension within the experimental data, or a shortcoming of the theoretical model used to simulate both the ⟨mT⟩ − m0 and the v2 data at high beam energies for a given equation of state. It should also be noted that at higher beam energies the contributions of the mesonic degrees of freedom to the equation of state become more dominant, which may make an explicitly temperature-dependent equation of state necessary.
Finally, the extracted EoS can be tested against various observables, such as differential flow measurements (see appendix G, which includes Refs. [51-55]) or different flow coefficients. The slope of the directed flow, dv1/dy, at mid-rapidity is calculated using the reconstructed MEAN and MAP EoSs. The results, together with the available experimental data, are shown in figure 4. The dv1/dy predictions closely match the experimental data, especially at the higher energies, for the MEAN EoS extracted from all 15 data points. The 1-sigma confidence intervals are indicated as colored bars; they are shown for only one beam energy due to the high computational cost. It can be seen that at high energies the prediction in the 13-points case clearly undershoots the data, while in the 15-points case the experimental data lie at the border of the 1-sigma band. The reconstructed EoSs are consistent with the dv1/dy data at all other energies as well, even though these data were not used to constrain the EoSs.
To relate the extracted high-density EoS to constraints from astrophysical observations, the squared speed of sound (cs2) at T = 0 is presented for the MEAN EoSs as a function of the energy density in Figure 5, together with a contour representing the constraints from recent Binary Neutron Star Merger (BNSM) observations [60,61]. The speed of sound, being a derivative of the pressure, is very sensitive to even small variations of the potential energy. The cs2 values estimated from all data points show overall agreement with the cs2 constraints from astrophysical observations and predict a rather stiff equation of state at least up to 4n0. In particular, both the astrophysical constraints (see also [62]) and the EoS inference of the present work give a broad peak structure in cs2. This is compatible with recent functional renormalization group (FRG) [63] and conformality [64] analyses. However, if only the 13 data points are used, the extracted speed of sound shows a drastic drop, consistent with a strong first-order phase transition at high densities [8,9]. This is consistent with the softening observed for the ⟨mT⟩ − m0 data shown in Figure 3. To give an estimate of the uncertainty on the speed of sound, we have calculated the speeds of sound for 100,000 potentials which lie within the 68% credibility interval of the coefficients, excluding those which lead to acausal equations of state for densities below 4.5 n0.
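At T = 0 the squared speed of sound follows from the tabulated EoS as cs2 = dP/dε; a finite-difference sketch (the tabulated values here are placeholders, in natural units):

```python
import numpy as np

def speed_of_sound_squared(pressure, energy_density):
    """c_s^2 = dP/d(epsilon) along a T = 0 EoS, from tabulated values.
    Causality requires 0 <= c_s^2 <= 1 (natural units); samples
    violating this below a chosen density can then be discarded,
    as done for the 68% band described in the text."""
    return np.gradient(pressure, energy_density)
```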

Conclusion.
Bayesian inference can constrain the high-density QCD EoS using experimental data on the v2 and ⟨mT⟩ − m0 of protons. Such an analysis, based on HIC data, can verify the dense QCD matter properties extracted from neutron star observations, and it complements astrophysical efforts to extract the finite-temperature EoS from BNSM signals as well as to constrain its dependence on the symmetry energy.
A parametrized density-dependent potential is introduced in the UrQMD model, which is used to train Gaussian Process models as fast emulators for the MCMC sampling. In this framework, the input potential can be well reconstructed from HIC observables already available from experimental measurements. The experimental data constrain the posterior for the EoS constructed with our method for densities up to 4n0. However, beyond 3n0 the shape of the posterior depends on the choice of observables used, and as a result the speed of sound extracted from these posteriors exhibits obvious differences. The EoS extracted using all available data points is in good agreement with the constraints from BNSMs, with a stiff EoS for densities up to 4n0 and without a phase transition. A cross-check is performed with the extracted potentials by calculating the slope of the directed flow. Here, the MEAN potential extracted from all 15 data points gives the best, consistent description of all available data. The inference encounters a tension between the measurements of ⟨mT⟩ − m0 and v2 at a collision energy of ≈4 GeV. This could indicate large uncertainties in the measurements, or alternatively the inability of the underlying model to describe the observables with a given input EoS. Note that the data are from different experiments conducted during different time periods. The differences in acceptance, resolution, statistics and even analysis methods of the experimental data make it difficult to pin down the exact sources of these effects.
Tighter constraints and fully conclusive statements on the EoS beyond a density of 3n0 require accurate, high-statistics data over the whole beam energy range of 2-10 GeV, which will hopefully be provided by the beam energy scan program of STAR-FXT at RHIC, the upcoming CBM experiment at FAIR and future experiments at HIAF and NICA. It is noted that, when approaching higher beam energies, which would be important for extending the constraints to higher temperatures and/or densities, the currently used transport model needs to incorporate further finite-temperature and possible partonic matter effects together with relativistic corrections, which we leave for future studies. Further effort should be put into the development and improvement of the theoretical models to consistently incorporate different density-dependent EoSs for the study of systematic uncertainties [65]. In the future, the presented method can also be extended to treat more parameters of the model as free parameters in the Bayesian inference, which would also require more, and more precise, input data. In addition, other observables such as the higher-order flow coefficients and v1 can be incorporated into the Bayesian analysis, if permitted by computational constraints, for a more comprehensive constraint on the EoS.

Appendix B: The Gaussian Process emulators
The GP emulators are trained on a set of 200 different parameter sets, each with a different high-density EoS, and the performance of these models is then validated on another 50 input parameter sets. 15 different GP models are trained, each one predicting one of the observables (v2 for 10 collision energies plus ⟨mT⟩ − m0 for 5 collision energies). The trained GP models can be evaluated by comparing the GP predictions with the 'true' results of UrQMD simulations. The performance of the GP models in predicting the v2 and ⟨mT⟩ − m0 observables for the 50 different EoSs in the validation dataset is shown in figures 8 and 9, respectively. As evident in these plots, the GP models can accurately
predict the simulated observables, given the polynomial coefficients. Hence, the GP models can be used as fast emulators of UrQMD during the MCMC sampling. All the posterior distributions presented in this work are constructed from 4 different MCMC chains. Each chain generates 25000 samples after 10000 tuning steps.
Appendix C: The prior
In the following we explain the choice of the prior distributions used as the starting point of the Bayesian inference. Technically speaking, the prior distribution of each parameter θi is chosen as a Gaussian whose mean and variance are estimated from the randomly sampled EoSs, generated under physical constraints, that were used in the training of the Gaussian Process emulators. These constraints were introduced to ensure numerically stable results in training the GP models. To create such a robust training dataset, different physics constraints were applied, as discussed in appendix B. These constraints eliminate some of the wildly fluctuating and superluminal EoSs from the training data.
To ensure that the prior in the analysis is broad enough to reflect an a priori high degree of uncertainty (i.e., without introducing a bias), the means and widths of the distributions from the constrained GP training were also used in the prior. However, the polynomial coefficients θi resulting from these constraints, used to construct the prior distributions for the Bayesian inference, are then sampled independently and are thus not correlated as they would be in the GP model training. Thus, the priors for the Bayesian inference are much broader than the distributions used for the GP model training. The means and standard deviations of the Gaussian priors for the polynomial coefficients are shown in table I.
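Constructing these independent Gaussian priors from the constrained training set amounts to taking per-coefficient means and standard deviations while discarding the correlations, e.g.:

```python
import numpy as np

def independent_gaussian_priors(theta_samples):
    """Per-coefficient prior means and standard deviations from the
    constrained training set (rows: training EoSs, columns: theta_i).
    Correlations between coefficients are deliberately dropped,
    which broadens the resulting prior."""
    theta_samples = np.asarray(theta_samples, dtype=float)
    return theta_samples.mean(axis=0), theta_samples.std(axis=0)
```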
Regarding the prior for the Bayesian inference, it is important to note that a prior based only on the GP training constraints could be a good starting point for the parameter estimation, but it is not a necessary one. The physics constraints can disfavor the acausal range of the parameters. However, we employ this range only as a soft constraint in the prior, as we use the mean and width of each coefficient independently; thereby the prior is not limited by the correlations between the coefficients in the GP training set. This results in inferred potentials which can also lie outside the range of the various equations of state used to train the Gaussian Process emulator. Once the EoS is constrained, of course, many observables for many beam energies and system sizes can be predicted and compared. We are also planning to make the model available in the future so that all these possibilities can be explored.
In addition to the directed flow, which was shown in the letter, a comparison with recently published HADES data on the differential elliptic flow in Au-Au collisions at Elab = 1.23A GeV [55] is presented here. This comparison of the two different MEAN EoSs to the HADES data is shown in figure 15. As one can see, the extracted EoSs reproduce the pT dependence nicely up to a proton momentum of 1 GeV. Above this range, the model slightly overestimates the elliptic flow compared to the HADES data. The reason for this is likely a small momentum dependence of the potential interaction which is not considered in the present approach. It is, however, important to note that the integrated elliptic flow is only sensitive to the flow around the maximum of the proton pT distribution, which corresponds roughly to pT between 300 and 400 MeV.

Figure 1. (Color online) Visualisation of the sampled posterior in the closure test. The color represents the probability for the potential at a given density. The 'ground-truth' EoS used for generating the observations is plotted as a black solid line. The red dashed and orange dot-dashed curves are the MAP and MEAN EoSs of the posterior.

Figure 2. (Color online) Posterior distribution for the EoS inferred using experimental observations of v2 and ⟨mT⟩ − m0. The top panel shows the posterior when all 15 data points were used, while the bottom panel is obtained without using the ⟨mT⟩ − m0 values for √sNN = 3.83 and 4.29 GeV. The MAP and MEAN EoSs are in both cases plotted as red dashed and orange dot-dashed curves, respectively. The vertical grey line depicts the highest average central compression reached in collisions at √sNN = 9 GeV. The CMF EoS is plotted in violet for densities below 2n0.

Figure 3. (Color online) v2 and ⟨mT⟩ − m0 values from UrQMD using the MEAN and MAP EoSs as extracted from the measured data. The observables for the MAP and MEAN EoSs extracted using all 15 data points are shown as solid and dashed red lines, respectively, while those generated using only the 13 data points are shown as solid and dashed black lines, respectively. The experimental data are shown as blue squares. The uncertainty bands correspond to a 68% credibility constraint constructed from the posterior samples.

Figure 5.
Figure 4. (Color online) Slope of the directed flow, dv1/dy, of protons at mid-rapidity. The experimental data [37-39, 55-59] are shown as blue squares. The colored bars correspond to a 68% credibility constraint constructed from the posterior samples.

Figure 7. (Color online) Visualisation of the v2 and ⟨mT⟩ − m0 for 50 random EoSs from the training data. The upper plot shows v2 and the lower plot ⟨mT⟩ − m0 as functions of √sNN. The experimental measurements are plotted as blue squares, while the gray lines are from the training EoSs.

Figure 15.
Figure 9. (Color online) Performance of the Gaussian Process models in predicting ⟨mT⟩ − m0 for the 5 different collision energies. The predictions are shown in blue, while the black dashed line depicts where the true and predicted values are equal.

Table I. Means (µ) and standard deviations (σ) of the Gaussian priors for the seven polynomial coefficients (θi).