Measurements of µµ pairs from open heavy ﬂavor and Drell-Yan in p + p collisions at √

PHENIX reports diﬀerential cross sections of µµ pairs from semileptonic heavy-ﬂavor decays and the Drell-Yan production mechanism measured in p + p collisions at √ s = 200 GeV at forward and backward rapidity (1 . 2 < | η | < 2 . 2). The µµ pairs from c ¯ c , b ¯ b , and Drell-Yan are separated using a template ﬁt to unlike- and like-sign muon pair spectra in mass and p T . The azimuthal opening angle correlation between the muons from c ¯ c and b ¯ b decays and the pair- p T distributions are compared to distributions generated using pythia and powheg models, which both include next-to-leading order processes. The measured distributions for pairs from c ¯ c are consistent with pythia calculations. The c ¯ c data presents narrower azimuthal correlations and softer p T distributions compared to distributions generated from powheg . The b ¯ b data are well described by both models. The extrapolated total cross section for bottom production is 3 . 75 ± 0 . 24(stat) ± 0 . 35 0 . 50 (syst) ± 0 . 45(global)[ µ b], which is consistent with previous measurements at the Relativistic Heavy Ion Collider in the same system at the same collision energy, and is approximately a factor of two higher than the central value calculated with theoretical models. The measured Drell-Yan cross section is in good agreement with next-to-leading-order quantum-chromodynamics calculations.


I. INTRODUCTION
Lepton pair spectra are a classic tool to study particle production in collisions of hadronic beams. Famous discoveries using lepton pairs include the Drell-Yan mechanism for lepton pair production [1] and the J/ψ meson [2].
In this paper, we focus on the contribution of cc and bb decays to the lepton pair continuum above a mass of 1 GeV/c 2 . In recent years, measurements of cc and bb via the lepton pair continuum have been reported for various collisions systems at the Relativistic Heavy Ion Collider (RHIC) by the PHENIX [3][4][5][6][7] and STAR [8] Collaborations. So far these measurements have been limited to e + e − pairs at midrapidity. Now PHENIX adds a new measurement of the µµ pair continuum at forward rapidity obtained in p+p collisions at √ s = 200 GeV. With these data the contributions from cc and bb decays and the Drell-Yan production mechanism can be separated and used to determine their differential cross sections as function of pair mass, p T and opening angle.
Measurements of cc and bb in p+p collisions are important to further our understanding of the cc and bb production process, which despite considerable experimental and theoretical effort remains incomplete. Significant differences persist between data and perturbative-quantumchromodynamics (pQCD) based model calculations [9][10][11][12][13][14]. Single p T spectra of charm and bottom mesons, as well as their decay leptons have been measured over a wide range of beam energies and rapidity. For charm production, precise measurements at RHIC [15][16][17], Tevatron [18] and the Large Hadron Collider (LHC) [19][20][21][22] indicate that pQCD calculations underestimate the charm cross section, even when contributions beyond leading order are taken into account [9,10,12,13]. For bottom production, the case is less clear. At RHIC, the bottom cross section has been measured via various channels by PHENIX [7,23,24] and STAR [25]. The measured bottom cross sections also tend to be above pQCD predictions, albeit with relatively large uncertainties. At higher energies, the bottom cross sections measured by D0 at √ s = 1.8 TeV [26], ALICE at √ s = 2.76 and 7 TeV [27], and ATLAS at √ s = 7 TeV [28] again tend to be above pQCD predictions, while similar measurements from CDF at √ s = 1.8 TeV [29], CMS at √ s = 7 * PHENIX Spokesperson: akiba@rcf.rhic.bnl.gov † Deceased TeV [30] and LHCb at √ s = 7 and 13 TeV [31] do not demonstrate significant deviations from pQCD.
Studying the angular correlation between the heavy flavor quarks, or their decay products, provides additional constraints on theoretical models and may help to disentangle different heavy flavor production mechanisms. Measurements at the Tevatron [32] and LHC [33,34] can be reasonably well described by next-to-leadingorder (NLO) pQCD calculations. At RHIC, dilepton measurements at midrapidity [3,5,7] can also be reproduced by different pQCD models in the measured phase space, but extrapolations beyond the measured range are model dependent, in particular for cc production.
Besides the interest in the production mechanism itself, a solid understanding of cc and bb production in p+p collision is needed as a baseline for measurements involving nuclear beams, where deviations from the p+p baseline are often interpreted as evidence for hot or cold nuclear matter effects. In collisions with nuclei, modifications to the parton distribution functions, typically expressed as shadowing or anti-shadowing, may need to be taken into account. Also modifications in the final state, incorporated through changes to the fragmentation functions may need to be considered. It is broadly expected that in asymmetric collision systems like p+A or d+A, deviations from the p+p baseline indicate such cold nuclear matter effects. Uncertainties on cc and bb production in p+p limit the precision on the quantification of cold nuclear matter effects. For example, previous dilepton correlation studies indicated a significant modification of heavy flavor yields at forward-midrapidity in d+Au collisions [35], but not at mid-midrapidity [7]. In addition, in heavy-ion collisions the charm contribution is an important background to possible thermal dilepton radiation from the Quark Gluon Plasma [4,6,8]. Current uncertainties in our understanding of cc and bb production prohibit this measurement at RHIC energies.
In this study we make use of the fact that muon pairs from cc and bb decays and from Drell-Yan production contribute with different strength to the muon pair continuum in different phase-space regions for µ + µ − and µ ± µ ± charge combinations. Neither cc decays nor Drell-Yan production contribute to µ ± µ ± pairs. In contrast, bb decays do. As illustrated in Fig. 1, µ ± µ ± muon pairs from bottom arises from two separate mechanisms, (i) from a combination of B → µ and B → D → µ decay chains [36] or (ii) from decays following B 0B0 oscillations [37]. These two contributions dominate the high mass µ ± µ ± spectrum, which allows a precise measurement of the bottom cross section. At midrapidity the e + e − pair continuum is dominated by pairs from heavy flavor decays in the measurable range from 1 to 15 GeV/c 2 [7], and thus having established the bb contribution would be sufficient to extract the cc cross section. However, at forward rapidity, µ + µ − pairs from Drell-Yan can not be neglected. The Drell-Yan process involves quark-antiquark annihilation [38], whereas heavy flavor production is dominated by gluon fusion [11]. Due to the relative large Bjorken-x of valence quarks compared to gluons, at forward rapidity the µµ pair yield above a mass of 6 GeV/c 2 is dominated by pairs from the Drell-Yan process. Thus, the Drell-Yan contribution can be determined from µ + µ − pairs at high masses.
Once the contributions from bb decays and Drell-Yan production are constrained, the yield from cc can be measured in the mass range from 1 to 3 GeV/c 2 , where it is significant, but only one of multiple contributions to the total yield in the mass range.
The paper is organized as follows: Sec. II outlines the experimental apparatus and the relevant triggers. Sec. III describes the procedure to extract muon pairs from the data. The expected µµ pair sources are discussed in Sec. IV. The Monte Carlo simulation used to generate templates for µµ pair spectra from the expected sources, which can be compared to the data, are presented in Sec. V. In Sec. VI we document the iterative template fitting method used to determine cc, bb and Drell-Yan cross sections. Sec. VII discusses the sources of systematic uncertainties. The results are presented in Sec. VIII and finally we summarize our findings in Sec. X.

II. EXPERIMENTAL SETUP
The PHENIX detector comprises two central arms at midrapidity and two muon arms at forward and backward rapidity [39]. The configuration of the experiment used for data taking with p+p collisions in 2015 is shown in Fig. 2. Two muon spectrometers cover ∆φ = 2π in azimuth and −2.2 < η < −1.2 (south arm) and 1.2 < η < 2.4 (north arm) in pseudorapidity. The central arms are not used in this analysis. Each muon arm comprises a forward-silicon vertex tracker (FVTX), followed by a hadron absorber with a muon spectrometer behind it. The spectrometer is composed of a charged particle tracker (MuTr) inside a magnet and a muon identification system (MuID). The FVTX allows for precision tracking, but has limited acceptance and is thus not used in this analysis.
The hadron absorber is composed of layers of copper, iron, and stainless steel, corresponding to a total of 7.2 interaction lengths (λ I ). The absorber suppresses muons from pion and kaon decays by about a factor of 1000, as it absorbs most pions and kaons before they decay. A small fraction of pions and kaons decays before they reach the absorber, which starts about 40 cm away from the nominal interaction point.
The MuTr has three stations of cathode strip chambers and provides a momentum measurement for the charged particles remaining after the absorber. The MuID is comprised of five alternating planes of steel absorbers [4.8 (5.4) λ I for south (north) arm] and Iarocci tubes (gap 0-gap 4). The MuID provides identification of charged-particle trajectories based on the penetration depth. Only muons with momentum larger than 3 GeV/c can penetrate all layers of absorbers. Signals in multiple MuID planes are combined to MuID tracks, which are used in the PHENIX trigger system to preselect events containing muon candidates. The trigger used to select the event sample for this analysis is a pair trigger (MuIDLL1-2D). For muon pairs with tracks that do not overlap in the MuID the MuIDLL1-2D is fired if both tracks independently fulfill the single track trigger requirement (MuDLL1-1D), which requires that the MuID track has at least one hit in the last two planes. A more detailed description of the PHENIX muon arms can be found in Ref. [40].
The beam-beam counters (BBC) [41] comprise two arrays of 64 quartzČerenkov detectors located at z = ±144 cm from the nominal interaction point. Each BBC covers the full azimuth and the pseudorapidity range 3.1 < |η| < 3.9. The BBCs are used to determine the collision-vertex position along the beam axis (z vtx ) with a resolution of roughly 2 cm in p+p collisions. The BBCs information also provides a minimum-bias (MB) trigger, which requires a coincidence between both sides with at least one hit on each side. The cross section of inelastic p+p collisions at √ s = 200 GeV measured by the BBC, which is determined via the van der Meer scan technique [42] (σ p+p BBC ), is 23.0±2.2 mb.

A. Data set and event selection
The data set analyzed here was taken with p+p collisions at √ s = 200 GeV in 2015. The data were selected with the µµ pair trigger (MuIDLL1-2D) in coincidence with the MB trigger. Each event in the sample has a reconstructed vertex within z = ±30 cm of the nominal collision point. The data sample corresponds to 1.2 × 10 12 MB events or to an integrated luminosity of Ldt = 51 pb −1 .

B. Track reconstruction
Each reconstructed muon track comprises a combination of a reconstructed tracklet in the MuTr and in the MuID. A number of quality cuts are applied to reduce the number of background muons from light hadron decays. They are summarized in Tab. I. The tracklet in the MuTr must have a minimum of 11 hits and a χ 2 /N DF smaller than 15 (20) for the south (north) arm. The MuID tracklet has to penetrate to the last gap and must have at least 5 associated hits. MuID tracklets with χ 2 /N DF larger than 5 are rejected. MuTr tracklets are projected to MuID gap 0. We apply cuts on the distance between the projection of the MuTr tracklet and the MuID tracklet (DG0) and the difference between the track angles (DDG0). Figure 3 depicts DG0 and DDG0 distributions for muons with momenta of 4 to 5 GeV/c from µµ pairs in the mass region 2.8-3.4 GeV/c 2 where µµ pairs from J/ψ dominate the yield. Both distributions are compared to tracks from simulated J/ψ decays. These cut variables are well described by simulations. We apply a cut at 3σ (99.87% efficiency) of the momentum dependent matching resolution of signal tracks determined from Monte Carlo simulations with geant4 [43].
In addition to the basic track quality cuts, we enforce that the momentum of all reconstructed muon tracks are within 3 < p [GeV/c] < 20 and that their rapidity to be 1.2 < |η| < 2.2. These requirements limit effects from detector acceptance edges. The upper limit on p removes tracks from hadronic decays within the MuTr volume that lead to a mis-reconstructed momentum. We also require that all tracks satisfy the MuIDLL1-1D trigger condition.
While traversing the hadron absorber muons undergo multiple scattering and lose typically 2 GeV of their energy before they reach the MuTr, where the momentum of the track is determined. Thus, the momentum needs to be corrected to correspond to the momentum in front of the absorber. The relative resolution has two main components, the intrinsic resolution of the MuTr and the resolution of the energy loss correction. Below 10 GeV/c the resolution depends only moderately on rapidity or momentum and is approximately constant between 3.5 and 5%. Towards larger momenta it gradually increases but remains below 10% for all momenta considered in this analysis (p < 20 GeV/c). Multiple scattering in the absorber adds an uncertainty of 160 mrad on the angular measurement from the MuTr. This can be vastly improved with the FVTX, which measures the track in front of the absorber. However, as discussed in the following section we do not make use of this improvement in the current analysis.

C. Muon pair selection
All muon tracks in a given event are combined to pairs and their masses and momenta are calculated. The mass is calculated from a fit to the two tracks with the constraint that both originate at a common vertex within the range ±40 cm around the nominal event vertex. This fitting procedure improves the resolution of the opening angle of the pair, which in turn significantly improves the mass resolution at m < 3 GeV/c 2 where the mass resolution is dominated by effects from multiple scattering. We achieve a mass resolution σ m /m ≈ 12.6%, 7.4%, 5.7% at m = 1.02, 3.10, 9.46 GeV/c 2 corresponding to the φ, J/ψ and Υ(1S) respectively, which is sufficient for the analysis of the µµ pair continuum.
The mass resolution could be further improved by constraining the fit to the measured vertex position. However, our data set contains on average 22% of pileup events with two collisions recorded simultaneously. For these events only an average vertex position can be measured, which is often off by tens of centimeters from one or both of the collision points. This leads to µµ pair masses reconstructed hundreds of MeV/c 2 different from the true mass and results in a mass resolution function with significant non-Gaussian tails. Figure 4(a) compares the mass distribution of the south muon arm and Fig. 4(b) for the north arm. The mass is calculated from the fits that constrain the tracks to originate from a vertex located at (i) ±40 cm of the nominal vertex (mass nominal ), and (ii) ±2 cm of the measured vertex using the BBC (mass BBC ). Although the width of the J/ψ is narrower for mass BBC as expected, the yield at the continuum on either sides of the J/ψ is significantly different for the two mass calculations. To further diagnose this issue, we selected pairs with mass BBC between 1. pared mass BBC and mass nominal distributions. In both mass BBC selections, a clear J/ψ peak is observed for mass nominal , which indicates that the mass BBC continuum contains a significant fraction of mis-reconstructed J/ψ mesons, where the mis-reconstructed mass is due to a mis-measured vertex using the BBC in pileup events.
To avoid this undesirable complication of the analysis of the µµ pair continuum, we do not make use of the improvement of the mass resolution. The pileup events increase the yield of µµ pairs per event by about 10%, this is taken into account in the normalization procedure.
We apply additional quality cuts to the muon pairs,  which are summarized in Table II. The χ 2 vtx , computed from the simultaneous fit of the two muon tracks, must be less than 5. This cut mainly removes tracks that were either scattered by large angles in the absorber or that resulted from light hadron decays. We also remove pairs with a momentum asymmetry (|p 1 − p 2 |/|p 1 + p 2 |) larger than 0.55 because these pairs are mostly from random pairs where one hadron has decayed into a muon inside the MuTr and is mis-reconstructed as a higher momentum track, thus yielding a fake high mass pair.
Finally, we impose cuts to ensure spatial separation between two tracks in the MuTr and MuID volumes. Specifically we require that the vertical and horizontal spatial separation of the two tracks at the MuID gap 0 exceeds 20 cm. This cut removes all pairs with tracks that overlap so that for the remaining pairs the pair reconstruction and trigger efficiencies factorize into a product of single track efficiencies. Figure 5 shows the raw mass spectra after imposing all single and pair cuts. Spectra are presented for µ + µ − and µ ± µ ± pairs measured for collisions in three vertex regions separately for the south and north arms.
The most prominent feature in the spectra is the J/ψ peak at ∼ 3.1 GeV/c 2 . For each arm the yield is independent of z within 10%-20%. Pairs in the north arm are reconstructed with about 2/3 of the efficiency compared    to the south arm, which is mostly due to a larger dead area in the north MuTr, but otherwise the spectra are similar for mirrored z ranges. The like-sign spectra have the lowest yield for the z range closest to the absorber, negative and positive z for south and north arm, respectively. The µ ± µ ± yield increases by roughly a factor of three as the collision point moves away from the absorber and more pions and kaons decay in flight before reaching the absorber.

IV. EXPECTED PAIR SOURCES
To interpret the experimental data shown in Fig. 5, we need to compare it to the µµ pairs from known sources, commonly referred to as "cocktail". Besides our signal of interest, µµ pairs from open heavy flavor (semi-leptonic decays of cc and bb) and Drell-Yan, the cocktail contains large contributions from hadron (pseudoscalar and vector meson) decays, and unphysical background pairs. The quantitative comparison is done through template µµ pair distributions that are generated for the individual known sources.
The unphysical background pairs typically involve muons from the decays of light hadrons (π ± , K ± , and K 0 ). The production rates of decay muon from light hadrons overwhelm those of signal muons from cc, bb, and Drell-Yan. Therefore, in spite of the large hadron rejection power (∼ 1/1000) of the muon arms, a substan-tial fraction of the reconstructed muons are from pion and kaon decays that occur before they reach the absorber. Because the distance to the absorber varies from 10 to 70 cm, depending on the z location of the event vertex z vtx , the unphysical background varies significantly with z vtx . A smaller, but non negligible fraction of background tracks are hadrons that penetrate all layers of absorber and are therefore reconstructed as muon candidates. In addition, hadrons can interact strongly with the absorber to produce showers of secondary particles, which can also be reconstructed as muon candidates. Pairs including at least one of these so called hadronic tracks, i.e. a muon from light hadron decay, a punch-through hadron or a secondary particle from hadronic showers, are a large contribution to the measured µµ pairs.
In the following subsections we discuss how we can generate the known sources of µµ pairs, which are needed as input for the templates of µµ pair spectra used in the subsequent analysis.
The mesons ρ, ω, φ, and J/ψ can be generated based on the measured differential cross sections [44,45] that are displayed on in Fig. 6(c). We use the Gounaris/Sakurai parameterization to describe the line shape of the ρ meson mass distribution [46]. The ρ is fixed to the ω with σ ρ /σ ω = 1.21 ± 0.13, which is consistent with the value found in jet fragmentation [36]. Because there is no measurement at forward rapidity, we constrain the η and η using measurements at midrapidity [47][48][49], which is shown in Fig. 6(a), and use pythia v6.428 [10] to extrapolate to forward rapidity.
The p T spectra of ψ and Υ are generated using pythia and normalized using the measurements of ψ to J/ψ ratio [55] and B µµ dN Υ /dy [56], respectively. All mesons are decayed using pythia to handle the decay kinematics.

Open Heavy flavor
The µµ pairs that originate from semi-leptonic decays of heavy flavor hadrons, or heavy flavor pairs, are simulated using two event generators, pythia and powheg.
We use pythia version v6.428 [10]. We use Tune A input parameters as shown in Table VI in Appendix C. In contrast to using the forced cc and bb production modes (MSEL4 or 5), which include only lowest-order process of flavor creation (gg → QQ), we used the mode (MSEL1) which also simulates higher-order processes of flavor excitation (gQ → gQ) and gluon splitting (gg → QQg). Figure 7 shows the Feynman diagrams corresponding to the different production processes. Leading order matrix elements are used for the initial hard process, and next-to-leading order corrections are implemented with a parton-shower approach. A classification of the three classes of processes can be achieved by tagging the event record which contains the full ancestry of any given particle; a detailed account of the characterization of these three classes can be found in Ref. [11].
We also use powheg version v1.0 [12] interfaced with pythia v8.100 [58] to generate heavy flavor muon pairs. We use the default setting for cc and bb productions, including the choices for normalization and factorization scales and heavy quark masses. CTEQ6M is used for parton distribution functions of the proton. In contrast to pythia, NLO corrections are directly implemented in the hard process using next-to-leading order matrix elements. As such, the classification of processes in pythia is not applicable for powheg; there is no trivial connection between the classes of processes in the pythia formalism and the powheg formalism.
The simulated mass spectra of pairs in the ideal muon arm acceptance, which requires that each muon has a momentum p > 3 GeV/c and falls into the pseudorapidity range 1.2 < |η| < 2.2, from cc and bb are shown in Fig. 8. Like-sign pairs from cc is found to be negligible compared to bb in the entire kinematic region and hence neglected for this analysis.
The µ + µ − and µ ± µ ± pair spectra from bb are very similar for both generators; this is consistent with the findings in Refs. [5,7] that, because of the large b-quark mass the spectra are dominated by decay kinematics rather than the correlation between the b andb quarks. For the same reason variations of the scale and PDFs have a small effect on the shape of the mass spectra.
In contrast, we observe a significant model dependence for µ + µ − pairs from cc, indicating a much larger sensitivity to the correlation between the c andc quarks. Similar to e + e − pairs [6], this is most pronounced at low masses. This is due to differences in description of the correlations between the c andc quarks; the opening angle distributions in powheg is flatter which leads to higher yields at low masses. A smaller but non-negligible discrepancy at higher masses is also observed. Because high mass pairs are dominated by back-to-back pairs from leading order processes, this difference is likely due to a harder p T spectrum predicted by powheg compared to pythia.
was determined by investigating the p T distribution of unlike-sign pairs in the mass region 4.8-8.6 GeV/c 2 where the yield is expected to be dominated by Drell-Yan [59]. The procedure and its associated uncertainties will be explained in detail in Sec. VII A 4.

B. Unphysical µµ pair sources
Unphysical pair background is customarily subdivided into combinatorial and correlated pairs. Here the idea is that for combinatorial pairs the two tracks have no common origin and thus are uncorrelated. In contrast, for correlated pairs the tracks do have a common origin, for example they both stem from the decay chain of a heavy hadron or they were part of the fragmentation products of a jet or the like.
In p+p collisions, or generally in events with a small number of produced particles, the distinction between combinatorial and correlated pairs is not well defined. A p+p collision typically produces hard scattered partons accompanied by an underlying event, which consists of initial and final state radiation, beam-beam remnants and multiple parton interactions. The complex event structure in a single p+p event forbids a clear identification of whether two particles stem from a common origin or not. All particles are produced from the two colliding protons, and thus are correlated through momentum and charge conservation. Therefore, the separation is more procedural and is defined by how the relative contributions of correlated and combinatorial pairs are determined. We use an approach that maximizes the number of pairs considered combinatorial, which will be discussed in detail in Sec. VI A 2.
The individual contributions of the unphysical pair background are determined using Monte-Carlo event generators. We treat pairs that are made from two hadronic tracks (hadron-hadron pairs: N hh ) and those with one Comparison of µµ yield in the ideal muon arm acceptance determined using pythia (red solid) and powheg (black dotted). Both are normalized using cross sections(σcc = 312µb, σ bb = 3.86µb) from [7]. The width of the pythia band represents the statistical uncertainty in the calculation.
hadronic track and the other being a muon from the decay of a D, B, or J/ψ meson (muon-hadron pairs: N Dh , N Bh and N Jh ) separately.

Hadron-hadron pairs: N hh
The N hh pairs are simulated with pythia, using parameters listed in Table VI. This Tune A setup reproduces jet-like hadron-hadron correlations at midrapidity in p+p collisions at √ s = 200 GeV [60] reasonably well. To also reproduce the p T spectra we use momentum dependent weighting to match the pythia distributions to data. In the literature there are no data for p T spectra of charged pions and kaons from p+p collisions at √ s =200 GeV in the rapidity region covered by the muon arms. Thus, we interpolate between p T spectra measured at midrapidity [48,[50][51][52] and very forward rapidity (2.9 < y < 3.0) [53]. The data are given in Fig. 6. Weighting factors are extracted for both rapidity ranges as a function of p T , by taking the ratio between data and pythia, where h stands for pion or kaon. For a given p T , we linearly interpolate the weighting factors as a function of y: These weighting factors are shown in Fig. 9. Above p T = 5 GeV/c, where there are no data at forward rapidity, the weights are assumed to be constant. The systematic uncertainties from this weighting procedure are discussed in Sec. VII. The weighting factors are applied to each input particle generated with the pythia simulation. 2. Muon-hadron pairs: N Dh , N Bh , and N Jh Muon-hadron pairs N Dh and N Bh as defined above are constructed using the same pythia and powheg simulations that determine the open heavy flavor pair input. The pion and kaon p T spectra are tuned the same way as discussed above. For the muon-hadron pairs involving decays of the J/ψ (N Jh ) we also match the pythia J/ψ momentum spectrum at forward rapidity to reproduce the measured J/ψ-hadron yield per MB event [44] (see Fig. 6).

Combinatorial pair background
The combinatorial pair background is constructed via an event mixing technique, which combines tracks from different events of similar vertex position z. This is done separately for data and the events used to simulate hadron-hadron pairs, and muon-hadron pairs.
To optimize the description of the pair background spectrum, we maximize the contribution identified as combinatorial pair background, subtract the combinatorial component from the simulation of hadron-hadron and muon-hadron pairs, and substitute the combinatorial pair background with the one determined from data. The motivation of this procedure and the details of the normalization of individual components are discussed in Sec. VI A 2.

V. SIMULATION FRAMEWORK
To directly compare the expected sources to the data, the µµ pairs from the expected sources are propagated through a Monte-Carlo simulation of the PHENIX detector. This simulation is designed to emulate in detail the detector response, and the recording and analysis of data taken with the PHENIX experiment. Histograms of the expected number of µµ pairs are constructed in mass-p T bins, which serve as templates for the subsequent fitting procedure.
The µµ pairs from all physical sources are propagated through the default PHENIX simulation framework. The same approach is not practical for unphysical pair background from π and K decays. Because of the large (∼1/1000) rejection power for these backgrounds, an undesirably large amount of simulations would be necessary to reach sufficient statistical accuracy. Therefore, we use a fast Monte-Carlo (FastMC), developed specifically for this analysis. Detailed descriptions of the two simulation chains can be found in Appendix A.

VI. ITERATIVE PROCEDURE TO EXTRACT CHARM, BOTTOM AND DRELL-YAN CROSS SECTIONS
In the previous two sections we have discussed the different expected sources of µµ pairs and how template distribution of µµ pairs are generated for each. In this section we compare the templates for the expected sources to the experimental data and determine the absolute contribution of each source.
After an initial normalization is chosen for each template, the key sources, cc, bb, Drell-Yan, and the hadronic pair background, are normalized in an iterative template fitting procedure.
A. Initial normalization and data-driven tuning of cocktail

Physical µµ pair sources
The normalization of muon pairs from hadron decays h → µµ(X) is fixed because the cross sections of the parent mesons are set by experimental data as discussed in Sec. IV A 1. The normalizations for each component are varied separate within experimental uncertainties to estimate the corresponding systematic uncertainties (see Sec. VII).
The distributions for muon pairs from cc, bb, and Drell-Yan are normalized by the parameters κ cc , κ bb , and κ DY . These parameters will be determined via the iterative fitting procedure presented in this section. The initial values of κ cc , κ bb , and κ DY are set based on measured data [7].

Correlated hadrons and combinatorial pair background
The composition and normalization of the unphysical pair background sources is key to understanding the µµ continuum and requires a more detailed discussion. In p+p collisions at √ s = 200 GeV, the multiplicity of produced particles is low, and hence there is no clear-cut method to differentiate between a correlated pair and a combinatorial pair. Great care is taken to assure that the procedure used to define combinatorial pairs and how their contribution is normalized does not affect the extraction of physical quantities.
One possibility to circumvent the distinction of correlated and combinatorial pairs is to generate hadronhadron and muon-hadron pairs using a Monte-Carlo event generator like pythia interfaced to the FastMC framework. Templates from a full event normalization include all background pair sources, hence the distinction between them is not necessary. However, in this method the extracted physical cross section is sensitive to how accurate pythia describes the underlying event and how well geant4 treats hadronic interactions in the absorber. This may increase the systematic uncertainties on the extraction of the cc, bb, and Drell-Yan components.
In this analysis we use a data-driven hybrid approach, in which • the maximum possible number of combinatorial pairs is determined from the generated pythia and/or powheg events, • the correlated hadronic pairs are calculated by subtracting the combinatorial pairs determined by mixing generated events, • the combinatorial pairs are replaced by the combinatorial pairs determined from data.
Although the distinction between correlated hadronic pairs and combinatorial pairs depends on the choice of the normalization procedure, using different procedures has a negligible effect on extraction of physical cross sections. The separation of these two components is mostly important for the evaluation of systematic uncertainties, because the correlated hadronic pairs depend on simulations and the combinatorial pairs do not. Replacing the combinatorial pairs from the generator with mixed pairs from data should be regarded as a correction to the simulations to reduce systematic uncertainties.

Normalizing hadron-hadron and muon-hadron pairs
The templates for hadron-hadron pairs N hh (m, p T , z) are generated using pythia simulations interfaced to the FastMC, as discussed above. Templates are determined separately for the three different z regions (z i ) available in the FastMC simulations, z 0 = (−22.5, −17.5 cm), z 1 = (−2.5, +2.5 cm) and z 2 = (+17.5, +22.5 cm), respectively. Only pions, kaons, and their decay products are considered. The momentum spectra were tuned to accurately describe experimental data, where available (see Sec. IV B 1). Therefore, N hh contains the correct mix of individual hadron-hadron pair sources per event. N hh is initially normalized as a per event yield for generated MB p+p collisions.
Similarly, muon-hadron pair templates from cc and bb are constructed using pythia and powheg generators interfaced to the FastMC. The templates N Dh (m, p T , z) and N Bh (m, p T , z) correspond to muon-hadron pairs from cc and bb, respectively. Each is normalized per cc or bb event. Thus, they can be added to N hh scaled by the normalization factors κ cc and κ bb , used for the µµ pairs, such that κ cc N Dh and κ bb N Bh are the expected muon-hadron pair yields per MB p+p event.
For J/ψ, the differential cross section at forward rapidity has been measured [44]. Analogous to the pion and kaon simulations, we weight the simulated J/ψ momentum distribution to match the J/ψ yield at forward rapidity. Because the simulated J/ψ yield is normalized to the measured yield, the muon-hadron pair template N Jh (m, p T , z) represents a yield per MB p+p event.
The full per MB p+p event hadronic pair background can thus be written as: where the templates are functions of m, p T , and z. To minimize any remaining model dependence in N hbg used in the analysis, we determine the combinatorial contribution to N hbg from mixed generated events and replace it with the combinatorial pairs determined from data. For each simulation we determine the combinatorial pairs by mixing either hadron-hadron pairs or muonhadron pairs from different events at the same z i . For a given z i bin the combinatorial pairs are then constructed as: which observes the same relative normalization of the individual components as in Eq. 4. The contributions of each component to the hadronic and the combinatorial pair background, normalized following the above procedure are shown in Fig. 10(b). The normalization of the combinatorial pairs is determined statistically via the ZYAM (Zero Yield At Minimum) technique [61] as described below. We use the azimuthal angle difference ∆φ prim of the like-sign hadronic pairs with masses less than 3 GeV/c 2 . Here ∆φ prim is the difference of the azimuthal angles of the input particles (π, K, D, or B); the distribution is shown in Fig. 11.
First, we remove muon-hadron pairs in which both tracks originated from heavy flavor (cc or bb) pairs, because these pairs can uniquely be identified as correlated. For the remaining pairs we assume that correlations result mostly from jet-fragmentation. These should have a minimal contribution for ∆φ prim ∼ π/2. Thus, our ZYAM assumption is that the correlated yield vanishes at ∆φ prim = π/2. The excess yield for ∆φ prim < π/2 can be interpreted as pairs from the same jet, whereas the excess yield for ∆φ prim > π/2 would correspond to µµ pairs from back-to-back jets. The correlated N corr,sim and combinatorial N comb,sim contributions are now separated via the relations: The separation of N hbg into correlated and uncorrelated components is done for each of the three vertex region z i used in the FastMC simulations. In the data, mixed events are also constructed in 5 cm z-bins, but over the full range from -30 cm to 30 cm. The template distributions are aggregated for three broad vertex ranges, z 0 = (−30, −10 cm), z 1 = (−10, +10 cm) and z 2 = (+10, +30 cm). The normalization of the mixed events from the data is matched to those from the simulation by scaling N hbg (z i ) such that the number of combinatorial pairs of data and simulations are identical in the normalization mass region M (m < 3GeV/c 2 ) for each z bin, i.e., we require: (a) full simulation for hadronic pairs and (b) combinatorial pairs for mass spectra of hadron-hadron and muon-hadron pairs from charm, bottom and J/ψ after initial normalization and tuning.
This rescaling is necessary because we are approximating a ∆z i range of 20 cm from data with a ∆z i range of 5 cm from simulations. For the two z bins further away from the absorber, this approximation holds well even without rescaling because the multiplicity falls linearly with the distance from the absorber, and the center of the bin times the bin width is to first order a good approximation of the integral of the bin. However, for the z bin closest to the absorber, this linear relation no longer holds and a scaling factor of 1.2 is applied to N comb,sim according to Eq. 7.
We then replace the combinatorial background from simulations by data for each vertex region z i : . ZYAM normalization procedure for the south muon arm. The normalization of the uncorrelated pairs from event mixing (red) is determined by enforcing the requirement that the yield of the uncorrelated pairs (Ncorr,sim) is identical to the yield of foreground pairs (N hbg ), excluding the pairs from heavy-flavor decay chains (green) at ∆φprim ∼ π/2. The excess yield is from away-side and near-side jet-like correlations (blue). The periodicity of the distributions arises from the octant structure of the MuTr.
The hadronic pair background in each vertex slice for the south arm, before and after the above replacement of the combinatorial pair background, is shown in Fig. 12. The relative mass-dependent difference between the two estimates of the hadronic pair background ranges from ∼ 0% for the z vtx region closest to the absorber to a maximum of ∼ 20% at m ∼ 4 GeV/c 2 for the z vtx region furthest away from the absorber.
The same normalization is applied to unlike-sign hadronic pairs. Both the unlike-and like-sign hadronic pairs are scaled with a common normalization factor κ h to be determined in the fitting procedure. Finally, we define the correlated hadronic pairs, N cor and combinatorial pairs, N comb via the relations: The distinction between correlated and combinatorial hadronic pairs depends on the details of the normalization procedure. Different normalization procedures can lead to significant differences in the relative contributions

Fit strategy
The absolute contribution of each of the various known sources to the µ + µ − and µ ± µ ± spectra is determined by a fitting procedure using a template distribution for each contribution. There are four fit parameters, κ cc , κ bb , κ DY , and κ h , which are normalization factors for the contributions from cc, bb, Drell-Yan, and the hadronic pairs.
We adopt the following iterative fitting strategy, here parameters marked with a tilde correspond to fit values obtained in the previous step: (i) With a fixed κ cc , fit the like-sign spectrum with κ bb and κ h as free parameters in mass-p T -z vtx slices in the mass range 1-10 GeV/c 2 .
(ii) With the same κ cc as in step (i) andκ bb andκ h obtained in (i), fit mass and p T slices in the unlikesign mass region 4.4-8.5 GeV/c 2 with κ DY as a free parameter.
(iii) Withκ bb andκ h obtained in (i) andκ DY in (ii), fit mass and p T slices in the unlike-sign mass region 1.4-2.5 GeV/c 2 with p T < 2 GeV/c, with κ cc as a free parameter.
This method of fitting exploits the fact that the likesign pairs contain mainly contributions from hadronic pairs and bb; charm only contributes via muon-hadron pairs and is non-dominant while Drell-Yan does not contribute. Thus, the fit results in step (i) is not sensitive to the initial starting value of κ cc . The contribution of hadronic pairs to the µ + µ − and µ ± µ ± pairs increases as the distance between the event vertex z vtx and the absorber becomes larger, due to enhanced probability of pions and kaons to decay before they hit the absorber. In contrast, the yield of µµ pairs from bb is independent of z vtx . To optimize the separating power between µµ pairs from bb and the hadronic pairs, in step (i) we fit likesign pairs in mass-p T -z vtx slices.
Step (i) gives strong constraints on κ bb and κ h , which are to first order free from systematic uncertainties on the cc and Drell-Yan templates. With κ bb and κ h constrained, we move on to step (ii), where we fit the unlike-sign pairs with mass 4.4-8.5 GeV/c 2 . This mass region is chosen to avoid contributions from quarkonia decays. Here, Drell-Yan and bb contributions are expected to dominate while contributions from cc and hadrons are secondary. Although Drell-Yan also contributes to lower masses, the sensitivity to the intrinsic k T make it unfavorable to constrain κ DY in the low mass region. With κ bb , κ h and κ DY constrained, we fit in the mass region 1.4-2.5 GeV/c 2 to constrain κ cc . This mass region is chosen to minimize the contributions of decays from quarkonia and low mass mesons. In this step, we exclude the region with p T > 2 GeV/c from the µ + µ − spectra from the fit, to avoid the uncertainty of the shape of Drell-Yan contribution in this region due to its sensitivity to k T . We then repeat this fitting procedure using the fitted κ cc value obtained in step (iii), and iterate until stable fit results are obtained. Although the fit results in step (i) is not very sensitive to the initial starting value ofκ cc , the iterative procedure ensures consistency and robustness of the final fit results.

Fit function
We use the log-likelihood fit which is applicable to bins having few (or zero) entries. For fitting the µ ± µ ± spectra in step (i), we first divide the data and simulations into mass, p T and z vtx bins. The parameters κ bb and κ h are then varied to minimize the negative log-likelihood defined by: where y i is the number of counts in the i th mass-p T -z vtx bin and C(i; κ bb , κ h ) is the number of expected counts in the i th mass-p T -z vtx bin from all cocktail components. N bb (i) is the number of µµ pairs from bb in the i th bin per generated bb event, N hbg * (i;κ cc , κ bb ) is the sum of the combinatorial and correlated hadronic pairs per MB event, with fixedκ cc . Similarly the log-likelihood function for step (ii) is defined as: where y i is the number of counts in the i th mass-p T bin, C(i; κ DY ) is the number of expected counts in the i th mass-p T bin from all cocktail components. The definitions for N bb (i) is the same as in Eq. 10, while N hbg * (i;κ cc ,κ bb ) is the sum of the combinatorial and correlated hadronic pairs per MB event, with fixedκ cc and fixedκ bb . N cc (i) and N DY (i) are the number of µµ pairs from cc and Drell-Yan pairs in the i th bin per generated cc and Drell-Yan event respectively. N h→µµ(X) (i) is the number of µµ pairs from hadron decays which is constrained from previous measurements. Finally, the log-likelihood function for step (iii) is defined as: where y i is the number of counts in the i th mass-p T bin, C(i; κ cc ) is the number of expected counts in the i th mass-p T bin from all cocktail components. The definitions for N cc (i), N bb (i), N DY (i), N hbg * (i;κ cc ,κ bb ), and N h→µµ(X) (i) are the same as in equations (10) and (11).

Fit results
The three step fitting procedure is iterated until we obtain stable values of κ cc , κ bb , κ DY , and κ h . The fitting procedure is done separately for the two arms. Because the contribution of charm to the like-sign spectrum is very small, the fit converges after two to three iterations. The fit results for the two arms are consistent with each other.
In this section, example fit results using the following simulation configurations are shown: cc and bb generated using powheg, Drell-Yan generated using pythia with intrinsic k T = 1.1 GeV/c. Variations of simulation settings are considered in the evaluation of systematic uncertainties, which will be discussed in Sec. VII. Mass spectra of µ + µ − and µ ± µ ± pairs integrated over p T are shown in Figs. 13 and 14 respectively. Figs. 15 and 16, give a more detailed view of µ + µ − and µ ± µ ± mass spectra in p T slices. The data distributions are well described by the cocktail simulation in both mass and p T except for a small kinematic region at m < 1 GeV/c 2 which is unimportant for the current analysis.

C. Signal extraction
Different cocktail components contribute with different strength to the muon pair continuum in different mass regions for µ + µ − and µ ± µ ± charge combinations. To obtain differential measurements we identify mass regions for the cc, bb, and Drell-Yan signal, where the ratio of the signal to all other µµ pairs is the most favorable for that signal. These regions are referred to in the following as charm, bottom, or Drell-Yan mass region, respectively. The mass regions are:      For each region we extract differential distributions by subtracting all other µµ pair sources.
1. Azimuthal correlations and pair pT of µ + µ − from cc Figure 17 shows the number of pairs per event as a function of their azimuthal opening angle, ∆φ, or their pair transverse momentum p T in the charm mass region. The data are compared to all other sources that contribute in this region. For each ∆φ or p T bin, the number of pairs from charm decays (N +− cc ) is obtained as: where N +− incl is the number of pairs passing all single and pair cuts in Tables I and II  The azimuthal opening angle distribution and pair p T distribution for µ ± µ ± pairs from the bottom mass region is shown in Fig. 18. Besides the bb contribution there are also contributions from correlated and combinatorial hadronic pairs. The number of pairs from bottom decays (N ±± bb ) is obtained according to the following relation: where N ±± incl is the number of pairs passing all single and pair cuts in Tables I and II, N ±± cor is the estimated number of pairs from correlated hadrons, and N ±± comb is the estimated number of combinatorial pairs. We subtract the background as a function of ∆φ or pair p T .
where N +− incl is the number of pairs passing all single and pair cuts in Tables I and II, N +− J/ψ,ψ is the estimated number of pairs from J/ψ and ψ decays, N +− Υ is the estimated number of pairs from the Υ family, N +− cor is the estimated number of pairs from correlated hadrons, and N +− comb is the estimated number of combinatorial pairs. The background contributions as a function of pair mass or p T are shown in Fig. 19.

D. Acceptance and Efficiency Corrections
To obtain a physical yield or cross section Γ, the raw yield Γ raw determined in the previous section, must be corrected for detector effects in multiple steps.
where Γ and Γ raw can represent differential or integrated quantities. The raw yield is converted to yield per event by dividing by N BBC , the number of sampled MB events.
The 23.0 ± 2.2 mb at √ s = 200 GeV [42], it relates to the inelastic p+p cross section σ pp : where BBC = 0.55 ± 0.06 is the fraction of inelastic p+p collisions recorded by the BBC. The BBC trigger bias for hard scattering events is bias = 0.79 ± 0.02 [62]. The other factors in Eq. 16 are rec , the pair reconstruction efficiency that accounts for efficiency losses due to track reconstruction, single track and pair cuts, the software trigger efficiency, and detector inefficiency; A, the detector acceptance; and α, an additional normalization constant that accounts for effects not included in the Monte-Carlo simulation, which will be described in detail in Sec. VII D.
The acceptance A has different meanings for the different measurements presented here. The azimuthal opening angle distributions for µµ pairs from cc and bb are corrected up to the ideal muon arm acceptance, which requires that each muon has a momentum p > 3 GeV/c and falls in the pseudorapidity range 1.2 < |η| < 2.2. For the µµ pairs from Drell-Yan production the correction is for the muon pair to be in the rapidity range 1.2 < |y µµ | < 2.2. To determine the bb cross section we correct up to 4π, the full phase space as shown in Tab. III. In general, A × rec is calculated using the default simulation framework. Input from the appropriate physics event generator is run through the simulation; the ratio of the reconstructed Γ M C raw yield over the input yield Γ M C gives A × rec .
Finally, the factor α accounts for the combined effect of double interactions, α double ; modifications of the reconstruction efficiency due to detector occupancy, α occ ; the change of the trigger livetime with luminosity, α live ; and additional variations with luminosity, α lum ; which are not included in the Monte-Carlo simulations. We determine α by comparing the measured J/ψ cross section [44] with the result using Eq. 16 with α = 1. α = 1.38 for south and north muon arm, respectively. We obtain α = 1.30 ± 0.16 and α = 1.38 ± 0.17 for south and north muon arm, respectively. Our values are consistent with the product of the individual factors α double × α occ × α live × α lum within the systematic uncertainties, where the individual factors are determined with data driven methods (see Sec. VII D).

Azimuthal correlations and pair pT of µµ from cc and bb
The fully corrected per event pair yield is given by Eq. 18.
where X is either ∆φ or pair p T , ∆X is the corresponding bin width, and N HF refers to N +− cc or N ±± bb given by Eq. 13 and Eq. 14, respectively. All other factors are the same as in Eq. 16.
The pair reconstruction efficiency rec (X) is determined using input distributions from pythia and powheg and is computed by taking the ratio of reconstructed and generated yields with both generated tracks satisfying the condition of the ideal muon arm acceptance (p > 3 GeV/c and 1.2 < |η| < 2.2). Here we correct the data up to the ideal muon arm acceptance. We do not correct up to µµ pairs in 1.2 < |y µµ | < 2.2 to avoid systematic effects from model dependent extrapolations. Systematic uncertainties for model dependent efficiency corrections are determined by comparing rec (X) using pythia or powheg as input distributions. This will be discussed in detail in Section VII.

Drell-Yan
The differential cross section as a function of mass or p T is given by Eq. 19 and Eq. 20.
where N DY is raw yield of pairs from Drell-Yan given by Eq. 15. ∆m, ∆p T , and ∆y are the bin widths in pair mass, pair p T and pair rapidity respectively. The factors β(m, y) and β(y, p T ) correct the cross section averaged over the bin to the cross section at the bin center. These correction factors are estimated using pythia simulations and lie between 0.97 and 1.03. All other factors are the same as in Eq. 16. The pair acceptance and efficiency A × rec (m, y) and A × rec (y, p T ) are determined using input distributions generated using pythia. It corrects the pair yield to one unit of rapidity at 1.2 < |y µµ | < 2.2.

Bottom cross section
We also determine the bb cross section from the measured µµ pair yield from bb. In the fitting procedure we determined the normalization κ bb , which was chosen such that it directly relates to the σ bb cross section: The acceptance and efficiency corrections, trigger efficiency, branching ratios, and oscillation parameters are all implicitly encapsulated in κ bb , because the templates for fitting already include all the aforementioned considerations.
We used two models pythia and powheg, to take into account a possible model dependence. The extrapolation from the limited phase space of our µµ measurement to the entire kinematic region can be divided into four steps: • Extrapolation from µ ± µ ± muon pairs with m µµ > 3 GeV/c 2 in the ideal muon arm acceptance to all muon pairs (µ ± µ ± and µ + µ − ) with m µµ > 3 GeV/c 2 in the ideal muon arm acceptance.
• Extrapolation to all muon pairs in the entire mass region (m µµ > 0 GeV/c 2 ) in the ideal muon arm acceptance.
• Extrapolation to muon pairs in 4π. Step by step reduction of phase space for µµ pairs from bb production; starting from all µµ pairs produced to like-sign µµ pairs with mµµ > 3 GeV/c 2 in the ideal muon arm acceptance. All numbers represent the number of µµ pairs per generated pythia or powheg bb event in the specified phase space. Each step is cumulative to the previous, i.e. each row includes one more restriction to the µµ phase space. The factors in brackets quantify the decrease of the number of pairs from the previous step.  Table III quantifies each step. For clarity they are shown in reversed order. One can see that in each step, the difference between pythia and powheg is less than 8%, which is consistent with the observation from Ref. [5], that the model dependence of the extrapolation is small because the µµ (or ee) pair distributions from bottom are dominated by decay kinematics.
The differential cross section dσ bb /dy b | y b =±1.7 can be calculated as follows: 7 is the rapidity density of b quarks determined from the average of pythia and powheg, κ bb, N S is the fitted normalization for bottom from the north (south) muon arm.

VII. SYSTEMATIC UNCERTAINTIES
We consider four types of sources of possible systematic uncertainties on the extraction of µµ pairs from cc, bb, and Drell-Yan. These are uncertainties: • on the shape of the template distributions, • on the normalization of template distributions, • on the acceptance and efficiency corrections, • and on the overall global normalization.
The first three sources of systematic uncertainties are point-to-point correlated, but allow for a gradual overall change in the shape of the distributions. We refer to these uncertainties as type B. Global normalization uncertainties do not affect the shape of the distributions but only the absolute normalization; these are quoted separately as type C.
There are multiple contributors to each type of systematic error, for example the cc and bb templates are model dependent and can be determined with pythia or powheg. For each such case we repeat the full analysis with the various assumptions. The spread of the results around the default analysis is used to assign systematic uncertainties.
If we considered two assumptions, like in the example given, we quote the uncertainty as half the difference between the two assumptions. If there is a clearly preferred default case, we use the difference of results obtained with extreme assumptions to assign systematic uncertainties.
We quantify all systematic uncertainties as standard deviations. The systematic uncertainties on the different measurements are summarized in Table. IV. For the differential distributions of cc, bb, and Drell-Yan, the systematic uncertainties vary with azimuthal opening angle, pair p T or mass as shown in Fig. 20.

A. Shape of simulated distributions
The cc, bb, Drell-Yan, and hadronic pair background components are correlated through the fitting procedure, thus an uncertainty on the shape for any one template distribution will affect the fit results of all four components simultaneously. For example, if one increases the hardness of the input pion p T spectrum, the number of high mass like-sign hadron-hadron pairs will increase, which will lead to a smaller µ ± µ ± pair yield from bb. Because bb is the main competing source to the Drell-Yan process in the high µ + µ − pair mass region, this will in turn lead to a larger Drell-Yan yield. Drell-Yan and bottom both contributes to the intermediate mass region where cc is extracted, and hence will also modify the cc yield.
In the following we will discuss the uncertainties on the shape of individual contributions and how these uncertainties propagate to the measurement of all components.

Input hadron spectra
The input pion and kaon p T spectra are tuned to match PHENIX and BRAHMS data at y = 0 and y = 2.95, respectively. This is achieved by applying weighting factors (w h (y)) to the p T spectra from pythia, which are determined by a linear interpolation between the two ratios of pythia to the data at y = 0 and y = 2.95 (see Fig. 9). To estimate the systematic uncertainties on the input hadron p T spectra, we vary the weighting function. We use either w h ( y = 0) for all light hadrons, which gives a harder p T spectra than the default case, or w h ( y = 2.95), which gives a softer p T spectra. The shape of the hadron-hadron pair mass distribution changes significantly only for masses above 3 GeV/c 2 .
We take the difference of the cross sections obtained using these two sets of p T spectra and the default p T spectra as a systematic uncertainty on the input hadron spectra. For σ bb , this is determined to be +4.7% and −11.0%. The uncertainties are also propagated to the bb and cc azimuthal opening angle distributions and the Drell-Yan yields. In all cases this is a dominant contributor to the systematic uncertainties (see Table IV).
We have also considered using the bands shown in Fig. 9 as limits for the weighting factors, which lead to smaller uncertainties and we choose to quote the more conservative estimate. Uncertainties related to the choice of parton distribution function (PDF) are estimated by evaluating the differences obtained with simulations using the CTEQ5, CTEQ6, MRST2001(NLO) [63] and GRV98(LO) [64] parton distribution functions. The differences are negligible compared to the uncertainty due to shapes of the light hadron p T spectra.

Hadron simulation
The default PHENIX geant4 simulation utilizes the standard HEP physics list QGSP-BERT. For hadronic interactions of pions, kaons and nuclei above 12 GeV, the quark gluon string model (QGS) is applied for the primary string formation and fragmentation. At lower energies, the Bertini cascade model (BERT) is used, which generates the final state from an intranuclear cascade.
To estimate possible uncertainty due to the description of the hadronic interactions in the absorbers, we have used two other physics lists: The (i) FTFP-BERT list, which replaces QGS with the Fritiof model (FTF) for high energies. The FTF uses an alternative string formation model followed by the Lund fragmentation model. And (ii) QGSP-BIC where the low energy approach is replaced by the binary cascade model (BIC), which was optimized to describe proton and neutron interactions, but is less accurate for pions.
Using these different physics lists leads to a 2% difference of σ bb , and a negligible difference to the charm and Drell-Yan normalizations.

Charm and bottom simulation
There are potential model dependencies of the µµ and muon-hadron templates for cc and bb. To estimate these we compare the µµ and muon-hadron templates obtained using pythia and powheg. Systematic uncertainties on IV. Summary of arm-averaged relative systematic uncertainties for the total bottom cross section σ bb , the differential Drell-Yan cross section d 2 σDY →µµ/dmdy, and the bb (cc) differential yields dN bb(cc)→µµ /d∆φ. The systematic uncertainty type is indicated in the second column and is applicable only to the differential measurements. The uncertainties for the differential measurements vary with azimuthal opening angle, pair pT , or mass. Asymmetric uncertainties are quoted in bracketed values. For the cc measurement, the regions ∆φ < π/2, pT < 0.5 GeV/c and pT > 2.0 GeV/c are excluded because the yield approaches zero and relative systematic uncertainties diverge. With these regions excluded, the difference between the systematic uncertainties of all measurements for the south and north muon arms differs by no greater than 2% for all systematic uncertainties sources. charm and bottom are assumed to be uncorrelated and are added in quadrature.
Due to the large mass of the bottom quark, decay kinematics govern the shape of the distributions, hence the difference between pythia and powheg is small (see Fig. 8). The largest effect of this uncertainty is exhibited at mass ∼ 5 GeV/c 2 for the Drell-Yan measurement where the contribution of bb is around 40% of the total yield.
For charm, the model dependence is larger than that of bottom, particularly for m < 1 GeV/c 2 . In the high mass region powheg tends to predict higher yields for both µµ and muon-hadron templates, which is likely due to a harder single muon p T spectrum. However, this has a small effect on the extraction of bottom and Drell-Yan yields in the high mass region where the contribution of charm is less than 10%.

Drell-Yan
The intrinsic k T = 1.1 GeV/c used in the pythia simulations is determined by minimizing χ 2 of the p T distribution of Drell-Yan pairs in the Drell-Yan mass region, between data and simulations. Background components (mostly from cc and bb) are normalized using cross sections obtained from the procedure and subtracted as a function of p T . We find that an intrinsic k T of 1.1 GeV/c best describes the p T distribution of Drell-Yan pairs in the high mass region (see Fig. 21).
We vary the k T by ±0.1 GeV/c where the χ 2 changes by ∼ 1 to estimate uncertainties in the Drell-Yan distributions. The uncertainty mainly affects the cc yield at ∆φ < π/2 and p T > 2 GeV/c and is negligible elsewhere.

ZYAM normalization
To estimate the effect of varying the relative contributions between correlated and uncorrelated pairs, we have varied the mass region which we use for the ∆φ prim distribution. Instead of the default normalization region M below 3 GeV/c 2 , we picked 3 separate regions: 0.7-1.3 GeV/c 2 , 1.3-1.6 GeV/c 2 , 1.6-2.2 GeV/c 2 . This results in a variation of the ratio of correlated to uncorrelated pairs by ±10%. The relative effect on the sum of correlated and uncorrelated pairs is less than 2% over the entire mass region, and has a negligible effect on the determination of bb, cc, and Drell-Yan cross sections.

6.
Hadron-hadron correlations from pythia For the measurement of cc yields as a function of ∆φ or pair p T , correlated hadron pairs are a major background source. To estimate the uncertainty in the description of Tune A pythia, we compare distributions of like-sign pairs between data and simulation in the same mass region (1.5-2.5 GeV/c 2 ) where other contributions, including bb are negligible. We observe that the width of the back-to-back peak at ∆φ = π is slightly wider in data compared to pythia simulation. This is seen in the p T distributions as well, because p T is strongly correlated with ∆φ. The discrepancy is strictly less than 12% and varies with ∆φ or p T . One data driven approach would be to apply an additional weight to the unlike-sign hadronic pair background as a function of ∆φ or p T , where the weight is computed by taking the ratio between data and simulations using the like-sign pairs as a function of ∆φ or p T in the same mass region. This is motivated by the fact that the like-sign pairs are dominated by hadronic contributions in the mass region of interest.
Here we take the average between the Tune A setup and this data driven modification to be our central value, and assign a systematic uncertainty on the cc yields as the difference between these two approaches. The resultant systematic uncertainty is strongly ∆φ and p T dependent, ranging from 0% to 14%.

Azimuthal angle(φ) description in simulations
We compare the φ distributions of single tracks in data, simulations with the default framework, and the FastMC. We find reasonable agreement between data and the default simulation and conclude that the uncertainty from the default simulation framework is negligible. However, for simulations using the FastMC, we approximated the relative φ dependent efficiency by a weighting strategy in φ bins of finite width, which gives rise to a small smearing in the φ (and hence ∆φ and to a lesser extent p T ) distributions (see Fig. 36). We assign 5% uncertainty to the ∆φ distributions generated using the FastMC, which is estimated by comparing ∆φ distributions of mixed pairs between FastMC and real data. This in turn gives rise to an average of 5% and 3% to the cc and bb differential yields respectively.

8.
z-vertex description of simulations We have generated hadronic pairs in discrete z vtx regions that cover 1/4 of the full collision z vtx region using the FastMC. Figure 22 shows a comparison of data and simulations in different z vtx regions after the initial normalization (Sec. VI A 2) and iterative fitting procedure (Sec. VI B). We see good agreement between the simulations and data in all z vtx regions; there is no indication that the approximations in the z vtx description of correlated hadrons is biasing the fit of the like-sign pairs.
To estimate the systematic uncertainty on this approximation, recall that the yield of decay muons varies linearly with z vtx , whereas the yield of prompt muons is constant [16]. Thus, the main effect of the z vtx approximation is the uncertainty on the prompt muon to decay muon ratio. In the FastMC the ratio is determined in three vertex bins of 5 cm width at z vtx = -20, 0, and 20 cm, instead of the full 20 cm z vtx slices. We assign a systematic uncertainty by varying the prompt muon to decay muon ratio separately for each z vtx region. Because prompt muons are dominated by charm decays, we estimate this effect by varying the charm cross section by ±15% for one particular z slice separately. The effect on the fitted bb cross section is ∼ 1% and is negligible compared to other sources of systematic uncertainties.

B. Normalization of simulated distributions
In addition to uncertainties due to the shape of distributions, uncertainties on the normalization of one component can affect the yield of other components. We list sources of such uncertainties in this section.

Fitting
To estimate uncertainties in the fit range, we vary the lower bound of the fit range of like-sign pairs from m = 1.0 GeV/c 2 to m = 2.0 GeV/c 2 . The variation in σ bb is around 2% and is assigned as the systematic uncertainty on the fit range. The unlike-sign fit range is also varied to diagnose possible effects due to non Gaussian tails of the mass distribution of µ + µ − pairs from resonance decays. The variation of κ cc is less than 5% with different fit ranges in the unlike-sign, and this κ cc variation propagates into < 1% variation in σ bb .
We estimate possible uncertainties due to the stability of the fit by varying the binning of distributions. The variations are negligible compared to the statistical uncertainty. We therefore do not assign systematic uncertainties on fit stability.

Normalization of cocktail components
The vector mesons φ, ω, ρ, J/ψ, ψ , and Υ are background components to determine N +− cc and N +− DY in Eq. 13 and 15, respectively. Their normalizations are fixed using previous measurements. The normalization of each component has associated statistical and systematic uncertainties from those measurements. We add these uncertainties in quadrature and vary normalizations of these background components to estimate propagated uncertainties in N +− cc and N +− DY . Because the template fit excludes all mass regions dominated by resonance decays, the uncertainty from the normalizations of the resonances only have a minor effect of less than 2% on the fit results, which is negligible compared to other sources of uncertainties.

Statistical uncertainty in fit result
Charm, bottom, and hadronic pairs are background components for N +− DY . The statistical uncertainties on fitted values of κ cc , κ bb , and κ h become a source of systematic uncertainty for N +− DY . Similarly, systematic uncertainties for N +− cc arise from statistical uncertainties on κ h , κ DY , and κ bb , and N ±± bb from κ h and κ cc . The statistical uncertainties for κ bb and κ DY is ∼ 8%, and for κ h is ∼ 2% for each arm. The associated systematic uncer-tainty depends heavily on the signal to background ratio and varies from measurement to measurement.

C. Extrapolation, acceptance and efficiency
This section details systematic uncertainties related to acceptance and efficiency.

Model dependence on bb
We use the high mass like-sign pairs to constrain σ bb , hence a determination of dσ bb /dy involves an extrapolation to zero mass at forward rapidity, whereas the determination of σ bb involves a further extrapolation to the full rapidity region. This is dependent on correlations between µµ pairs from bottom as well as the oscillation parameters and branching ratios. To quantify the uncertainty in the extrapolation, we take the average of the fitted cross section σ bb using pythia and powheg and assign the difference (±6.5%) as the systematic uncertainty. We note that the difference between the default values of the time-integrated probability for a neutral B 0 d (B 0 s ) to oscillate χ d (χ s ) of pythia and the values from the PDG, χ d = 0.1860 ± 0.0011 (χ s = 0.499304 ± 0.000005) [36] is less than 2% and hence much less than the assigned uncertainty.

2.
Model dependence on efficiency correction The charm and bottom azimuthal opening angle distributions are corrected to represent µµ pairs the ideal muon arm acceptance. To assess the sensitivity to different input distributions we compare the efficiency as a function of ∆φ calculated using pythia and powheg. No model dependence of the efficiency corrections is observed for µµ pairs with ∆φ > 1.5 from cc and bb. For ∆φ < 1.5, we assign an additional uncertainty based on the difference of the efficiency corrections calculated by pythia and powheg.
The charm and bottom pair p T spectra are also corrected to represent the muon arm acceptance. No model dependence of the efficiency corrections is observed for µµ pairs in the measured p T range. We assign an uncertainty based on the statistical uncertainty of the calculated efficiency corrections.
For Drell-Yan, we estimate the model dependence of the acceptance and efficiency corrections by varying the intrinsic k T settings of pythia within the systematic limits as described in Sec. VII A 4. No model dependence of the acceptance and efficiency corrections is observed. We assign an uncertainty based on the statistical uncertainty of the calculated efficiency corrections.

Trigger efficiency
The possible discrepancy between the software trigger emulator and the hardware trigger is quantified by comparing the real data trigger decision with the offline software trigger. We find that they differ by within 1.0% and 1.5% for the south and north arm, respectively. We use these values as estimates of the associated systematic uncertainty.

Reconstruction efficiency
The muon track reconstruction and muon identification used in this analysis is the standard PHENIX muon reconstruction chain. The systematic uncertainties have been previously studied. We assign MuTr (4%) and MuID (2%) as systematic uncertainties on reconstruction efficiency based on the work published in [16].

D. Global normalization uncertainties
The absolute normalization of the µµ pair spectra is set by the measured J/ψ yield [44], which is measured with an accuracy of 12%. This is the systematic uncertainty on the scale for all results presented in this paper.
The normalization is expressed in Eq. 16 by the factor α, which accounts for the combined effect of the change of the trigger livetime with luminosity α live , modifications of the reconstruction efficiency due to detector occupancy α occ , additional variations of the efficiencies with luminosity α lum , and the effect of double interactions α double .
As a cross-check, these individual factors were determined separately. The trigger livetime was monitored during data taking and the correction was found to be 1.35 (1.30) for the south(north) arm, respectively. The occupancy effect was studied by embedding simulated µµ pairs in p+p events and results in α occ = 1.06 (1.04). In addition, there is a drop of the detector efficiency with increasing beam intensity that was found to give α lum = 1.04 (1.07).
Finally, the approximately 20% double interactions in the sample increase the pair yield by about 11%, resulting in α double = 0.90. The yield increase is smaller than the number of double interactions mostly for two reasons. Diffractive events contribute to events with double interactions but do not contribute significantly to the pair yield. Events with double interactions contain collisions more than 40-50 cm away from the nominal collision point; pairs from these events have significantly reduced reconstruction efficiency. The combination of both effects approximately cancel the efficiency losses due to detector occupancy and high interaction rates.
The product of individual corrections to the normalization is α double × α occ × α live × α lum = 1.33 (1.34) for the south (north) arm. These values are consistent within uncertainties with 1.30 ± 0.16 (1.38 ± 0.17), the values based on the J/ψ measurement.

VIII. RESULTS
A. Azimuthal opening angle and pair pT distributions for µµ pairs from cc and bb The fully corrected µµ pair yield from cc and bb decays are shown in Figs. 23 and 24 as a function of ∆φ and pair p T . The muons are in the nominal acceptance of p > 3 GeV/c and 1.2 < |η| < 2.2. The pairs are in selected mass ranges of 1.5 < m µ + µ − < 2.5 GeV/c 2 and 3.5 < m µ ± µ ± < 10.0 GeV/c 2 for cc and bb, respectively. The yields for the two pseudorapidity regions are consistent with each other. Due to the mass selection, the ∆φ and p T distributions are highly correlated with each other.
The spectra for the two pseudorapidity regions are combined using the method documented in Appendix B and compared to model calculations based on pythia and powheg. The comparison is shown in Figs. 25 and 26. Pairs generated by the models are filtered with the same kinematic cuts that are applied in the data analysis. The model curves are normalized using the fitting procedure outlined in Sec. VI B.
For cc the model calculations are normalized in the kinematic region 1.4 < m < 2.5 GeV/c 2 and p T < 2 GeV/c to the data. Consequently, as seen in Fig. 26, the p T spectrum is adequately described by both pythia and powheg for p T < 2 GeV/c. However, for p T > 2 GeV/c, the yield predicted by powheg is systematically higher than the data, while the yield from pythia is more consistent with the data.
The larger yield predicted by powheg also manifests itself in the ∆φ projection at ∆φ < 1.5. For cc, the azimuthal correlation determined with powheg is significantly wider compared to the one from pythia. Again the data favor pythia in the probed kinematic region. This is particularly apparent at ∆φ < π/2.
Because both pythia and powheg use the pythia fragmentation scheme and very similar parton distribution functions, the differences between the model calculations must result from the underlying correlation between the c andc quarks that originate from the pQCD differential-cross-section calculation. Our data are more consistent with pythia than with powheg. We note that this preference is not limited to data taken in the kinematic region accessible in this analysis; it also holds true for the mid-forward kinematic region probed by the PHENIX electron-muon measurement [35] and mid-mid kinematic region probed by the PHENIX dielectron measurement [7].
For bb, pythia shows a slightly wider peak in ∆φ than powheg. However, within uncertainties the data are well described by both generators in ∆φ and p T . The smaller model dependence can be traced back to the larger b quark mass, which is much larger than the muon mass [7]. For the bulk of B meson decays, the momentum of the muon is nearly uncorrelated to the momentum of the decay muon. Therefore, the opening angle between two muons from bb is randomized. In other words, the distributions of µµ pairs from bb are mostly determined by the decay kinematics and are less sensitive to the correlation between the b andb quark.
For the pythia calculation we can distinguish heavy flavor production from different processes, specifically pair creation, flavor excitation, and gluon splitting. To separate these we access the ancestry information using the pythia event record. Despite the fact that the measured azimuthal opening angle and pair p T distributions are constrained due to the limited acceptance and the mass selection, there are clear differences between the shapes generated by different processes. The leading order pair creation features a strong back-to-back peak, whereas next-to-leading-order processes exhibit much broader distributions. For bb, pythia predicts negligible contribution from gluon splitting, whereas for cc, there is significant contribution from gluon splitting, particularly for ∆φ < 1 and p T > 3 GeV/c. For both cc and bb, the default ratios and shapes of the three different processes from pythia describe the data well.
Although for powheg a similar separation is not possible, it seems as if contributions from higher order processes with characteristics similar to gluon splitting are more frequent in powheg than in pythia, leading to a broader azimuthal opening angle distribution and a harder p T spectrum for pairs from cc. More constraints on the cc correlations, which seem to drive the observed model differences, could be obtained from a quantitative and systematic study of heavy flavor correlations for p+p collisions at √ s = 200 GeV obtained from different kinematic regions. A simultaneous analysis of the ee [7], eµ [35] and µµ data can provide stronger discriminating power to different theoretical models. Such an analysis is presented in [65].

IX. BOTTOM CROSS SECTION
To determine heavy flavor production cross sections, the µµ pair data need to be extrapolated from the small kinematic region covered by the experiment to the full phase space. This extrapolation has to rely on model calculations. For the case of charm, there are significant discrepancies between the differential distributions calculated by different models, hence an extrapolation to full phase space is model dependent [7]. However, this is less of an issue for bottom production. The distributions of µµ pairs from bb are dominated by decay kinematics and model dependent systematic uncertainties on the extrapolation are much less dominant. In the following we determine the average of the bottom cross sections obtained from pythia and powheg using the fitting procedure, and assign systematic uncertainties according to the difference between models. The extracted cross sections using pythia and powheg are listed in Table. V. The first two columns display the cross sections obtained by fitting data from the south and north muon arm at backward and forward rapidity, respectively. These values are then converted rapidity dσ bb /dy at y = −1.7 and y = +1.7, correspond-ing to the average rapidity of the south and north muon arms.

|[rad]
The results are shown in Fig. 27 and compared to other PHENIX bottom-cross-section measurements via various channels (B → J/ψ [23], dielectrons [7], e-h correlations [24]), and differential cross sections computed  using fixed-order-plus-next-to-leading-log (FONLL) [9], mcnlo [13] and powheg [12]. In all three calculations, we adopted the "standard" value of m b = 4.75 GeV/c 2 [14]. This choice of the bottom quark mass is mainly motivated by the mass of Υ(1S). It has been shown in previous studies that the NLO pQCD calculations with this standard value of m b can reproduce the p+A and π+p bottom cross sections at low energies fairly well to within large experimental and theoretical uncertainties [66]. The large theoretical uncertainties arise from the renormalization and factorization scale, bottom quark mass and PDF choices. We observe that the model dependence on the differential bottom cross section as a function of rapidity is small (< 10%); it is mainly due to the uncertainties in the PDFs. The shaded band correspond to theoretical uncertainties estimated using a FONLL calculation, which includes uncertainties on the renormalization and factorization scales, bottom quark mass (varied between 4.5 and 5.0 GeV/c 2 ), and PDFs, added in quadrature. The measurements at √ s =200 GeV tend to prefer the upper limit of this uncertainty band.
The measurements using the two muon arms can be combined to give a more precise measurement of the total bottom cross section, σ bb [µb] = 3.75 ± 0.24(stat) ± 0.35

0.50
(syst) ± 0.45(global), which is the most precise measurement of the bottom cross section at √ s = 200 GeV to date. In Fig. 28, our measurement is compared to all other RHIC measurements.
As can be seen from Figs. 27 and 28, all RHIC bottomcross-section measurements are remarkably consistent with each other. We compare to the total cross sections from various next-to-leading order (NLO) or nextto-leading logarithmic (NLL) calculations, including the NLO calculation from Ref. [14], again using the value m b = 4.75 GeV/c 2 for the bottom quark mass. The total bottom cross section is around a factor of two higher than all theoretical calculations with m b = 4.75 GeV/c 2 .
These measurements can be compared to the global trend of the bb cross section as a function of √ s [27-29, 31, 67-72], as shown in Fig. 29. Interestingly, the variation of different theoretical calculations is less than 8% despite spanning 5 orders of magnitude in cross section and 3 orders of magnitude in beam energy. At beam energies larger than 2 TeV, the data points from the Tevatron and LHC are in good agreement with the central values of the theoretical calculations, in contrast to measurements at √ s = 200 GeV at RHIC. Following the unconstrained averaging procedure adopted by the PDG [36], the weighted average of the σ bb measurements at RHIC is 3.8 ± 0.5 µb, and is > 3σ higher than the the- is required for powheg to reproduce the bottom cross section measured at √ s = 200 GeV. This mass is significantly lower than the pole mass of the bottom quark, 4.78 GeV/c 2 [36], hence it is unlikely that this discrepancy can be explained solely by the uncertainty in the bottom quark mass.
This measurement indicates that an effect which is more visible at lower beam energies may still be missing in current theoretical calculations. Future measurements at beam energies between ∼ 10 GeV and ∼ 1000 GeV with higher precision should help shed light on this issue.

A. Drell-Yan differential cross section
The fully corrected µµ pair cross section from the Drell-Yan process in the pair rapidity region 1.2 < |y µµ | < 2.2, as a function of mass, and a function of p T for pairs in the mass region 4.8 < m [GeV/c 2 ]< 8.2 are shown in Figs. 30 and 31, respectively. The kinematic region covered by the measurement corresponds to a Bjorken-x value of ≈ 5 × 10 −3 . The measured differential Drell-Yan cross section at forward and backward rapidities are consistent with each other.
We combine the measurements from the two rapidity regions. The mass spectrum is then compared with NLO calculations from Vitev [73] and Qiu J. et al [74] in Fig. 32. Both calculations adopt the factorization approach where higher orders are evaluated order-by-order in perturbation theory. Within experimental uncertainties, the data are well reproduced by NLO calculations. The p T spectrum of Drell-Yan muon pairs in the mass region 4.8-8.2 GeV/c 2 is shown in Fig. 33 and compared The corrected µµ yield from Drell-Yan in pair rapidity region 1.2 < |y µµ | < 2.2 as a function or pair mass. Results are shown separately for the south and north muon arms.
to pythia, where the intrinsic k T is tuned from the procedure described in VII A 4, and normalized from the fitting procedure as documented in the above text. We find that an intrinsic k T of 1.1 GeV/c and a k-factor of 1.23 best describe the data. To date this is the first Drell-Yan measurement at RHIC energies. As Drell-Yan is a common background to various physics processes involving dileptons, the presented data may give a constraint for the background estimation of such measurements. The Drell-Yan cross section as a function of invariant mass and p T can also provide constraints on the unpolarized transverse-momentum-dependent parton distribution functions (TMD PDFs), which is of critical importance to understanding the internal structure of the proton. This measurement gives input to a previously unexplored phase space and serves as a solid baseline for future measurements.

X. SUMMARY
We present µµ pair measurements from open heavy flavor decays and the Drell-Yan mechanism in p+p collisions at √ s = 200 GeV. Invariant yields of µµ pairs from cc and bb are measured as a function of ∆φ and p T and compared to different models, pythia and powheg. Within experimental uncertainties, the azimuthal opening angle and pair p T distributions from bb are well described by these models. For cc, the data favor the pythia description, while the powheg calculations predict a systematically The corrected µµ yield from Drell-Yan in pair rapidity region 1.2 < |y µµ | < 2.2 and mass region 4.8 < m < 8.2 GeV/c 2 as a function of pair pT . Results are shown separately for the south and north muon arms.
higher yield than pythia at smaller opening angles in the probed kinematic region.
We find that the high mass like-sign pairs are dominated by decays from open bottom, which provides a strong constraint to the bottom cross section. The measured total bottom cross section is consistent with RHIC measurements at the same energy, and is around a factor of two higher than the central value of NLL and NLO calculations with an input bottom quark mass of m b = 4.75 GeV/c 2 .
The Drell-Yan cross section as a function of mass in 4.8-15.0 GeV/c 2 is presented and compared to NLO calculations from Vitev and Qiu. Within uncertainties we find good agreement between NLO calculations and data. The Drell-Yan p T cross section in the mass region 4.8-8.2 GeV/c 2 is also presented, along with the pythia tune that best describes the data. ence  framework takes into account the detector's geometrical acceptance and all inefficiencies from dead channels.
To account for variations of detector performance during the data taking period, the data are split into run groups with similar performance. For each group a map of dead channels is created for the MuTr. The simulation randomly selects these maps according to the sampled luminosity for each run group. Comparison of distributions from FastMC and default PHENIX simulation framework. (a) mass spectrum of J/ψ muon pairs; (b) single pT spectrum of muons from π ± and K ± with realistic input pT spectra; (c) pair pT spectrum of muon pairs from bb; (d) ∆φ of muon pairs from bb.
Muon pairs from physical sources are simulated with a z-vertex distribution taken from MB p+p data. Once the pairs are processed through the detector simulation, they are reconstructed using the same procedure and filtered with the same cuts as used for real data. Thus, all detector effects including acceptance, dead areas, track reconstruction, and analysis cuts are taken properly into account.
Because the analyzed data are triggered with the MuIDLL1-2D trigger, the effects of the trigger also need to be accounted for. To achieve this, we apply an offline software trigger to all simulated tracks, which is an exact replication of the online hardware MuIDLL1-1D trigger. We require that both tracks of a pair fulfill the MUIDLL1-1D trigger condition. Here we make use of the fact that after enforcing a spatial separation of 20 cm between two MuID tracks, the MuIDLL1-2D pair trigger is reduced to a logical AND of the MuIDLL1-1D single track triggers. The separation cut necessary to achieve this factorization was determined from experimental data. In Fig. 35(a,c) a ∼20-30% difference between the mass distribution from data triggered with the MuIDLL1-2D and the data requiring each track fulfills the MuIDLL1-1D is visible at low masses. Once the sep-aration cut is applied the difference disappears, as seen in panels (b) and (d).

FastMC
In spite of the large hadron rejection power (∼ 1/1000) of the muon arms, a significant fraction of the reconstructed muons are from decays of light-flavor mesons (π ± , K ± , and K 0 ). Using the default Monte-Carlo to simulate these pairs is unpractical, because for every 1,000,000 generated pairs of particles in the detector acceptance, only one muon pair would be reconstructed from the simulation. In the FastMC approach we separate the generation of particles that result in reconstructed µµ pairs from the simulation of the detailed detector response to an individual particle. The FastMC proceeds in four steps: (i) generation of a repository of possible detector responses to an individual particle using the default simulation framework, (ii) creation of events with multiple muons from the sources discussed in Sec. IV, here the repository created in step (i) is used to determine the detector response, (iii) weighting each reconstructed muon with the appropriate probability for being reconstructed and not rejected by the analysis cuts, and (iv) finally forming muon pairs and calculating their mass, p T and azimuthal opening angle.
a. Detector response to individual particles For each particle species (π ± , K ± , K 0 , and µ ± ) ∼ 10 9 particles were simulated. All particles are propagated through the full geant4 simulation and reconstruction chain. Light hadrons (π ± , K ± , and K 0 ) may give rise to a reconstructed muon either via (i) decaying to a muon in flight (decay muons), or (ii) penetrating all absorber layers (punch-through hadrons). The contribution from protons is negligible (< 1%) compared to kaons and pions and hence neglected in this study.
These parent particles are generated with flat distribution in momentum p and polar angle θ, and uniform distribution in φ. Simulations are performed in three uniform z-vertex regions, (−22.5, −17.5 cm), (−2.5, 2.5 cm), and (17.5, 22.5 cm), to account for variances in detector response along z vtx . Improvements by expanding to full collision z vtx coverage in simulations is expected to be minimal(see Sec. VII A 8). All reconstructed variables are stored along with the generated vertex and parent momentum information. These muon candidates are grouped into pools according to parent particle species and parent p and θ, where p and θ ranges from 2 to 32 GeV/c and 0 to 0.8 radians respectively, which covers the kinematic region relevant for this analysis. One single pool covers the kinematic region ∆p×∆θ = 0.1×0.02 [GeV/c rad]. The minimum number of muon candidates in one pool is ∼ 10. These pools are used as repository for the possible detector response to parent particles in the subsequent steps of the FastMC.

b. Events with reconstructed muons
To create an event with reconstructed muons, we first generate events of particles as discussed in Sec. IV B. For each event the list of particles is filtered so that only π ± , K ± , K 0 , and µ ± in the vicinity of the muon arm acceptance are kept, and the momentum information of these particles is stored. We will refer to these particles as input particles.
A given input particle is matched to a pool of muon candidates in the repository for that particle species, and the input particle's p and θ. We randomly choose one muon candidate from the pool and use the reconstructed variables from that muon candidate for the input particle. The repository pools were generated from parent particles with a uniform φ distribution. While the input particles are matched to the muon candidate in parent p and θ, they are not matched in φ. We therefore rotate all reconstructed variables in the azimuthal plane from the φ of the parent of the muon candidate to the φ of input particle.
At this point we have created a reconstructed muon with all the characteristics that could have resulted from propagating the input particle through the default simulation framework. In particular, because the matching of input particles to muon candidates is completely random, the relative contributions and momentum distribution of decay muons and punch-through hadrons are properly accounted for. This procedure is repeated for all input particles in an event.
c. Weighting each reconstructed muon with its probability So far each input particle leads to a reconstructed muon. This does not take into account the hadron rejection of the muon arms and the reconstruction efficiencies. Rejection and efficiency are encapsulated in weighting factors that are applied to each reconstructed muon. We factorize the weight into two components weight reco and weight φ , which are discussed in the following. The final weight is calculated as: weight = weight reco × weight φ . (A1) Weighting in p and θ The survival probability of a decay muon is highly dependent on the momentum of the muon, as well as the amount of material it traverses in the absorber, which in turn is dependent on the input particle's momentum p and the polar angle θ. We associate a weighting factor weight reco (p, θ, z) to each muon candidate. This factor is the probability that an input particle with momentum p and polar angle θ, produced at vertex z, results in the reconstructed muon candidate, averaged over φ. The weight is computed by dividing the number of reconstructed muons in each pool by the number of parent particles generated to create the corresponding pool.

Weighting in φ
In addition to weight reco (p, θ, z), we also need to weight in φ direction, weight φ , to account for the φ dependent relative survival probability and reconstruction efficiency. These mainly depend on the geometry of the MuTr, thus the weighting factors are determined by a combination of variables (φ MuTr , p MuTr T , p MuTr z ), which are the azimuthal position, transverse momentum, and longitudinal momentum evaluated at MuTr Station 1. To determine weight φ , we generate single muons with a realistic momentum distribution and propagate these muons through the default simulation framework. Because the overall survival probability is factored into weight reco , weight φ is normalized by requiring the average value of weight φ to be one, i.e. . (A2)

d. Constructing muon pairs
In each event all reconstructed muons are combined to pairs. The pair variables are constructed from the reconstructed muon information following the exact same procedure as in real data. The weighting factor for a muon pair is the product of the weighting factors of the two reconstructed muons: This assumes that the pair reconstruction efficiency is a product of single track reconstruction efficiencies, which is true for tracks that are spatially separated in the MuTr and MuID. The latter is assured by the pair cuts we apply.
To estimate the accuracy of the FastMC, which is used to simulate muon-hadron and hadron-hadron pairs, we propagate µµ pairs and single hadrons through the default simulation framework and FastMC and compared the resulting distributions. We find that the mass resolution, ∆φ, single and pair p T distributions are well reproduced by the FastMC (see Fig. 36). Small discrepancies are observed in the azimuthal opening angle distribution