PYTHIA 8 underlying event tune For RHIC energies

We report an underlying event tune for the PYTHIA 8 Monte Carlo event generator that is applicable for hadron collisions primarily at $\sqrt{s}$ ranges available at the Relativistic Heavy-Ion Collider (RHIC). We compare our new PYTHIA 8 tuned predictions to mid-rapidity inclusive $\pi^{\pm}$ spectra, jet sub-structure, Drell-Yan production, and underlying event measurements from RHIC and the Tevatron, as well as underlying event data from the Large Hadron Collider. With respect to the default PYTHIA 8 Monash Tune, the new `Detroit' tune shows significant improvements in the description of the experimental data. Additionally, we explore the validity of PYTHIA 8 predictions for forward rapidity $\pi$ in $\sqrt{s}$ = 200 GeV collisions, where neither tune is able to sufficiently describe the data. We advocate for the new tune to be used for PYTHIA 8 studies at current and future RHIC experiments, and discuss future tuning exercises at lower center-of-mass energies, where forward/backward kinematics are essential at the upcoming Electron-Ion collider.


I. INTRODUCTION
Monte Carlo (MC) event generators that simulate relativistic lepton-lepton, lepton-hadron, and hadron-hadron collisions are an essential part of high-energy particle and nuclear physics. From a theoretical perspective, MC models are able to test our fundamental understanding of the Standard Model, in particular Quantum Chromodynamics (QCD), and offer a prescription for the initial and final states of the collision system. From the experimental point of view, MCs are an integral piece in the simulation chain that aims to reproduce realistic spectra that are ultimately used to extract detector acceptance and resolution corrections, and study systematic effects in experimental data. There are several event generators that are currently available, for example PYTHIA [1][2][3] and Herwig/Herwig++ [4,5], that have been widely used to simulate collisions at the Large Hadron Collider (LHC) and the Relativistic Heavy-Ion Collider (RHIC). The simulation routines include various QCD physics processes that factorize a single collision into two regimes consisting of the perturbative hard scattering and evolution via a parton shower, and various nonperturbative components such as hadronization, underlying event and multi-parton interactions.
Physics processes implemented in event generators include multiple parameters which are turned to experimental measurements, often from e + + e − collisions. The PYTHIA event generator studied in this publication has been very successful in describing data at the LHC and has been the subject of many tuning exercises over the past decades [6][7][8][9][10][11][12][13]. While the global tuning of PYTHIA 6 and 8 in Refs. [6,7] are in good agreement with data at LHC energies, there are significant discrepancies in describing data from collisions at lower center-of-mass energies [14,15]. These disagreements can mainly be understood as a consequence of incorrect modeling of the soft QCD underlying event (UE) stemming from the center-of-mass energy extrapolation that is used. A PYTHIA 6 "STAR" tune was produced in Ref. [16] which updated the value of the parameter controlling the energy extrapolation of the low transverse momentum (p T ) cross section, and was able to adequately describe the π ±,0 p T and jet p T spectrum in proton-proton (p+p) collisions at √ s = 200 GeV from STAR and PHENIX as demonstrated in Refs. [14,17]. For PYTHIA 8, there has been several tuning exercises with LHC data at √ s = 7 and 13 TeV collisions, and some lower collision energy data either from the LHC or the Tevatron [9,10,12]. The LHC-focused tune produced in [10] utilized LHC data from √ s = 900 GeV and 7 TeV p+p collisions, and it was argued therein that including Tevatron data within the tuning procedure leads to inadequate tune performance at LHC energies. In contrast, measurements of the UE multiplicity from CDF [18] at √ s = 900 and 1960 GeV and LHC data at the top energies √ s = 7 and 13 TeV were used in this [12] tuning exercise to control the impact of energy extrapolation. However, the aforementioned tune was not able to describe the CDF UE data at √ s = 300 GeV. The energy dependence of low p T regularization was also studied in [9] using the same LHC data and CDF data at √ s = 300, 900, and 1960 GeV, and concluded that the inclusion of a new term in the extrapolation produced improved agreement at lower energies. All these observations motivate the need for a dedicated PYTHIA 8 tune, in addition to the existing PYTHIA 6 STAR tune, for the nominal RHIC energy of √ s = 200 GeV.
The organization of the following sections is as follows: section II describes the general tuning procedure via the Professor toolkit [11]; section III is dedicated to describing the data and RIVET [19] implementation; section IV presents our tuned results; section V compares the new PYTHIA 8 predictions to selected data distributions; in Section VI we compare the default and tuned distributions at forward rapidity; and section VII summarizes our results. The Appendix A presents the comparison of our new tune to all the experimental data.

II. TUNING PROCEDURE
We utilize a parameterization-based approach for the tuning procedure provided by the Professor (v2.3.3) [11] toolkit. In the Professor tuning methodology, PYTHIA parameters of interest are sampled n times across a provided range and for each sampling a MC generation is produced and the resulting prediction is compared to data. In each bin of data, the result of the random samplings are parameterized by a third order polynomial. In the case of N PYTHIA tuning parameters, the corresponding polynomials are N -dimensional. The coefficients of the polynomials are computed numerically within the Professor code. A χ 2 fit of the polynomial parameterizations to the data is performed using Minuit [20,21] to determine the best values of each tuning parameter.
The starting point of our tuning exercise is the default Monash tune [7] and PYTHIA 8.303. The NNPDF2.3 parton distribution functions (PDF) [22] used in the Monash tune have since been updated with improved data and methods, and therefore we utilize the recent NNPDF3.1 leading-order PDF set with α s (m Z ) = 0.130 [23]. The settings that are varied in our tune are p Ref T,0 and its energy-dependence scaling parameter, the proton matter distribution parameters, and color reconnection range. The p Ref T,0 parameters regularizes the low p T cross section divergence, and is a key parameter in all PYTHIA tunes. The energy-scaling of p T,0 follows a power-law function, and is controlled by the ecmPow parameter. We change the reference energy that corresponds to the p Ref T,0 parameter to be 200 GeV. We do so for two reasons: 1) using a different PDF set requires a complete re-tune of p Ref T,0 ; and 2) to control the power-law extrapolation as much as possible at low center-of-mass energies due to the rapidly varying functional form in this region. For the proton shape function, we change it to the double Gaussian matter profile (MultipartonInteractions:bProfile=2) and vary it's two respective parameters, coreRadius and coreFraction. Table I tabulates all the tuning parameters and their respective ranges. From the previous PYTHIA 6 "STAR" tune study in [16], the values for p Ref T,0 and ecmPow are expected to be smaller than the default values, and therefore their ranges are chosen to cover all values lower than the default with some overlap. The other tuning parameter ranges cover all possible values. A complete description of the above tuning parameters are given in [1][2][3]  We sample 70 values of the tuning parameters in Table I within the specified ranges using the Professor package to be used as anchor points in the generator response polynomial parameterization. We have checked that using a smaller fraction (but still sufficient for 5 tuning parameters) of the sampled anchor points produces compatible results with respect to the full sample. We generate 10 million events for each simulation run to ensure the MC statistics are sufficiently less than that of the data in the region of interest. The tuned results are determined by minimizing the weighted χ 2 , where the index i runs over all data points d i , w i is a weight for each data point, F [i] is the parameterized PYTHIA prediction for the i-th data point, and C −1 is the inverse covariance matrix which contains only experimental uncertainties (assumed no bin-by-bin correlations). For the final fit we do not assign any weights beyond unity to any of the data.

III. INPUT DATA AND MC GENERATION
Mid-rapidity data from STAR, PHENIX, and CDF are utilized in this tuning exercise. The measurements can be grouped in three main categories; identified particle spectra, event multiplicities, and jet substructure as described in Table II. Collectively, they broadly sample the available phase space and test the model's ability to describe the nonperturbative and perturbative parts of a p+p collision. The spectra measurements focus on π ± yields at STAR [24] and Drell-Yan di-muon pair production from PHENIX [25]. We include measurements of the UE event multiplicity from STAR [15] and CDF [18] at various regions such as towards, away, and transverse with respect to the trigger object, be it a reconstructed jet or charged hadron, respectively. Lastly, two sets of jet sub-structure measurements from STAR [26, 27] on the SoftDrop splitting observables at the first split and the invariant and groomed jet mass, all measured differentially as a function of the jet p T and jet resolution parameter R, are included in the tuning exercise. The substantial impact on jet substructure observables from the variation of the MPI and UE parameters is an indication of the multi-faceted inner workings of the PYTHIA event generator. The last column of Table II mentions the respective figure in both the paper draft and in the appendix where the comparisons with the data are shown. We prepared RIVET analyses for each of the measurements mentioned above (when analyses were not publicly available) and they are all made available here github.com/star-bnl/star-pythia8-tune.
The various MC runs are all analyzed by the RIVET analyses and the resulting output yoda files are processed by the Professor toolkit to determine the minimum χ 2 . We exclude the π ± cross section below p T < 1 GeV/c to avoid potential feed-down effects which we studied by turning weak decays on or off. We also remove data from the fit in which the envelope of the PYTHIA predictions are significantly varied compared to the data systematic uncertainties. Similarly, data points where our MC statistical uncertainties are still too large, for example the very high p T bins, are also excluded to optimize the tuning but we note that it is only applicable to a few regions of phase space of our observables. We explicitly turn off long-lived particle decays in our MC generation as all relevant observables have either been corrected for feed-down decays or unfolded to particle level distributions.
IV. NEW PYTHIA 8 DETROIT TUNE Figure 1 show the χ 2 profiles projected onto each tuning parameter in the vicinity of the global minimum (offset to have minimum at zero), and the one sigma correlation contours between all tuning parameters. The central values of the tuning parameters are tabulated in Table III, and define the new 'Detroit' tune. In general the errors on the central values from Minuit are much smaller than the eigentune values quoted below, and are therefore not given. The χ 2 per degrees of freedom (n.d.f.) for the best fit is χ 2 /n.d.f. = 611/493. CR:range  In Fig. 2 we plot the values for p Ref T,0 , at their respective reference energies, and their extrapolations for the Monash, CMS CP1 [12], and Detroit tunes. The extrapolations are determined as to the CMS CP1 tune as it followed a similar tuning strategy utilizing both LHC data and the CDF UE data at √ s = 900 and 1960 GeV, and the same PDFs. However, in the CMS CP1 tune the reference energy is kept at 7 TeV. The new p Ref T,0 value at a reference energy of √ s = 200 GeV is about 30% larger compared to the √ s = 200 GeV extrapolated p T,0 using the Monash values, and comparable to the extrapolated values in the CMS CP1 tune. However, in the latter comparison the values of p T,0 diverge as you go up or down in collision energy. The value for the energy-depended extrapolation reduced by almost 40% compared to the Monash tune, which significantly reduces how fast p T,0 varies. Compared to the CMS CP1 tune, our new extrapolation parameter is roughly 10% smaller. We observe that the proton overlap function parameters coreRadius and coreFraction, and color reconnection range to have slightly increased values in our tune compared to the default PYTHIA 8 and CMS CP1 values as  presented in Table III. We have also performed the tuning study with the default proton overlap shape function (MultipartonInteractions:bprofile=3) and find the description of the √ s = 300 GeV UE data to be inadequate, and had a global χ 2 more than a factor of two larger with respect to the Detroit tune. We also provide a set of 'eigentunes' that quantify the MC errors that may be used for systematic studies. This was done using the Professor package which diagonalizes the covariance matrix at the best fit point, and provides values for the tuned parameters that correspond to deviations along the principle directions for a fixed tolerance ∆χ 2 . For our analysis, we choose a heuristic ∆χ 2 =n.d.f./2 as done in Ref. [13].  Figure 3 shows the comparison of the PYTHIA 8 Monash and Detroit tunes for a representative sample of measurements in data. All others are shown in Appendix A. The mid-rapidity π + cross section on the left, the average UE transverse charge particle multiplicity in the middle, and the groomed jet mass distribution on the right for √ s = 200 GeV p+p collisions. We find, in general, the Detroit tune provides a better match to the data than the Monash tune, in particular the underlying event charged particle multiplicity. It is especially relevant to highlight that reducing the π cross section at low momenta directly translates to a significantly better description of the UE multiplicity and the jet mass, especially in the tails of the distribution. In √ s = 200 GeV p+p collisions, the π − spectrum is well described by the Detroit tune across the entire measured p T range. The π + spectrum is consistent except still being slightly over predicted above p T = 5 GeV/c by about 10% considering experimental uncertainties. The charged particle multiplicities in the toward, away, and transverse regions are well described by the Detroit tune across all leading jet p T . The average charge particle p T versus leading jet p T for the toward, away, and transverse regions changed only slightly in the Detroit tune with respect to the Monash tune. In the transverse region the description of the data is slightly improved and in the toward and away regions slightly degraded by a few percent. For the average p T for tracks with p T > 0.5 GeV/c in the transverse region, we observe both the Detroit and Monash tunes undershoot the data at low leading jet p T by around 8% and improve at higher leading jet p T . The Drell-Yan di-muon data is well described by the Detroit tune where as the Monash tune slightly under predicted the data. In all the jet-substructure observables we see an improved agreement with the data using the Detroit tuned PYTHIA, but observe some persistent discrepancies in the tails of the distributions.
We observe that the Detroit tune is able to describe the CDF charge particle multiplicities in all √ s = 300, 900, and 1960 GeV center-of-mass energies, as shown in Appendix A. The charged particle p T sum is well described at √ s = 1960 GeV. In the √ s = 300 and 900 GeV charged particle p T sum observables the Detroit tune slightly under predicts the data in the leading particle p T range of around 4 to 8 GeV/c, but is less than 10% considering experimental uncertainties.  We compare our tune with the Monash and CMS CP1 tunes to underlying event observables measured by the CMS experiment in data from 7 [28] and 13 TeV [29] p+p collisions in Figs. 4 and 5, respectively (additional 13 TeV comparisons in Appendix A). We note for all these comparisons, we have explicitly turned off long-lived decays in the MC generation. In the 7 TeV data, the Monash tune is able to describe the data better than the Detroit tune and the CMS CP1 tune. At higher leading track-jet p T , the Detroit tune is consistent with the data and the CMS CP1 tune slightly under predicts the data. At 13 TeV, our tune is able to describe the data above a leading track or jet p T of 5 and 10 GeV, respectively. The Monash tune is able to describe the charged particle density in the same regions, but under predicts the p T sum. The CMS CP1 tune under predicts both. We note that in the low p T regions, the predictions from our tune vary more in shape with respect to the Monash tune, and is due to the proton shape function used.

VI. COMPARISONS AT FORWARD RAPIDITY
We compare the Detroit and Monash PYTHIA 8 tunes to the measured π ± cross sections at rapidity y = 2.95 and 3.3 measured by the BRAHMS experiment [30] and the inclusive π 0 cross sections in 3.4< η <4.0 measured by the STAR experiment [31] in √ s = 200 GeV p+p collisions. As shown in Fig. 6 and 7 for BRAHMS and STAR data, respectively, both the Monash and Detroit tunes undershoot the data in the measured low and mid p T /E π ranges, and are consistent or exceed the data at high p T /E π . In the case of the Detroit tune, this discrepancy is persistent even considering the envelope covered by the eigentunes in Table IV. An interesting observation is that the Detroit tune does worse compared to the standard tune. We have attempted simultaneous tuning of mid-and forward-rapidity data, but are unable to recover the agreement seen for the mid-rapidity tune. The forward rapidity data favors larger values of p Ref.
T,0 compared to the Monash value to induce a better MC agreement with the data. We additionally have expanded our simultaneous tuning exercise to include the initial state radiation parameters α s , intrinsic k T , and (SpaceShower) p Ref.
T,0 , and utilizing a different proton shape parameterization (MultipartonInteractions:bProfile=3) with its associated parameter. Even with the inclusion of these additional tuning parameters and an enhanced tunable phase-space, we are still unable to reach a satisfactory agreement with both mid-and forward-rapidity data.  Having a PYTHIA tune that is able to simultaneously describe the physics at both mid-and forward-rapidity regions will be essential for the upcoming RHIC program. The STAR experiment has installed a new detector system in the forward region (2.5< η <4) for the 2022 Run [32]. Electromagnetic and hadronic calorimeters, as well as silicon trackers, will allow for full jet reconstruction for the first time in the far forward region. This will open up significantly enhanced opportunities for forward physics and mid-forward rapidity correlation studies with respect to previous data. Additionally, with the Electron-Ion collider program on the horizon spanning a broad kinematic range and excellent detector capabilities across rapidities, event generators will need to describe physics at the level of the experimental precision in order to advance the discovery potential. A forward tuning exercise that will aim to address these points will be the focus of future studies.

VII. CONCLUSION
This publication represents the first UE tuning exercise of the PYTHIA 8 generator utilizing data in proton-proton collisions at center-of-mass energies down to 200 GeV. We have provided a new UE tune, which we name the 'Detroit' tune, and a set of eigentunes for the PYTHIA 8 event generator that utilizes data from √ s= 200, 300, 900, and 1960 GeV hadron-hadron collisions. We observe that the default Monash tune, that was tuned to data at LHC energies, significantly deviates from STAR, PHENIX, and CDF measurements. This is due to the energy-scaling of the PYTHIA p T,0 regularization parameter not being appropriate. Our re-tuning of this parameter, and others related to the MPI, show a significant improvement in the simultaneous description of the data across all center-of-mass energies that are studied. We additionally compared our tune at LHC energies and find agreement with the UE data at higher p T . This agreement is either comparable or better than the default Monash tune. At low p T there are large deviations with respect to the data, and are driven by the different proton shape function used. We advocate for using this tune for PYTHIA 8 studies at both current (STAR) and future (sPHENIX) RHIC experiments for the higher statistics running period planned for 2022-2025. Also, there are possibilities for sequential re-tunings of PYTHIA 8 as more data becomes available at RHIC energies, particularly at forward rapidities, but also to additionally improve jet-substructure observables. Therefore, our tune serves as a natural starting point for these future exercises. Our tuning study encourages the authors of the respective MC models to take into account data at a variety of collision energies towards a universal parameterizations of physics leading to improvements across varied kinematics.