Measurement of the production cross section for Z + b jets in proton-proton collisions at $\sqrt{s} =$ 13 TeV

The measurement of the cross section for the production of a Z boson, decaying to dielectrons or dimuons, in association with at least one bottom quark jet are performed with proton-proton collision data at $\sqrt{s} =$ 13 TeV. The data sample corresponds to an integrated luminosity of 137 fb$^{-1}$, collected by the CMS experiment at the LHC during 2016-2018. The integrated cross sections for Z + $\ge$ 1 b jet and Z + $\ge$ 2 b jets are reported for the electron, muon, and combined channels. The fiducial cross sections in the combined channel are 6.52 $\pm$ 0.04 (stat) $\pm$ 0.40 (syst) $\pm$ 0.14 (theo) pb for Z + $\ge$ 1 b jet and 0.65 $\pm$ 0.03 (stat) $\pm$ 0.07 (syst) $\pm$ 0.02 (theo) pb for Z + $\ge$ 2 b jets. The differential cross section distributions are measured as functions of various kinematic observables that are useful for precision tests of perturbative quantum chromodynamics predictions. The ratios of integrated and differential cross sections for Z + $\ge$ 2 b jets and Z + $\ge$ 1 b jet processes are also determined. The value of the integrated cross section ratio measured in the combined channel is 0.100 $\pm$ 0.005 (stat) $\pm$ 0.007 (syst) $\pm$ 0.003 (theo). All measurements are compared with predictions from various event generators.


Introduction
A Z boson accompanied by bottom (b) jets can originate from quark-gluon, quark-antiquark, and gluon-gluon interactions in proton-proton (pp) collisions. These processes can be used to test perturbative quantum chromodynamics (pQCD) predictions by comparing the measured cross sections with theoretical predictions. They also provide information on the b quark parton distribution functions (PDFs) [1]. The measurement of Z + b jets helps to estimate the uncertainty coming from the PDF choice in the W boson mass measurement [2]. These processes are the dominant background for Higgs boson production in association with a Z boson (ZH, H → bb) in the standard model. They are also a source of background in many scenarios beyond the standard model such as the production of supersymmetric Higgs bosons in association with a b quark, and new generations of heavy quarks decaying to a Z boson and a b quark. Examples of Feynman diagrams of the signal process with quark-gluon, quark-antiquark, and gluon-gluon interactions for Z + 1 b jet and Z + 2 b jets final states are shown in Fig. 1. Two different approaches are available for calculating the Z + b jets production, the five-flavor and four-flavor schemes, and, as shown in Ref. [3], they both yield consistent results. Previous measurements have been performed on Z + b jets at √ s = 1.96 TeV by the CDF [4] and D0 [5] Collaborations at the Fermilab Tevatron in proton-antiproton collisions. At the CERN LHC similar measurements have been performed at 7 TeV by the CMS [6] and ATLAS [7] Collaborations, at 8 TeV by CMS [8], and at 13 TeV by ATLAS [9], all in pp collisions.
In this paper, the integrated cross sections together with differential distributions of the Z(→ ) + ≥ 1 b jet and Z(→ ) + ≥ 2 b jets are reported (where = ee or µµ). The differential distributions normalized to the corresponding fiducial cross sections are presented as well. In the measurements of normalized distributions, several systematic uncertainties cancel, which allows a more precise comparison with theory. The differential and normalized differential cross section distributions for Z + ≥ 1 b jet production are measured with respect to six kinematic observables: the transverse momentum of the dilepton (p Z T ); the highest p T (leading) b jet (p b jet T ); the absolute pseudorapidity of the b jet (|η b jet |); the azimuthal angle difference between the Z boson and the b jet (∆φ (Z,b jet) ); the rapidity difference between the Z boson and the b jet (∆y (Z,b jet) ); and the angular correlation variable, which is defined as the separation in the pseudorapidity and azimuthal distance between the Z boson and the b jet [∆R (Z,b jet) = ∆η (Z,b jet) 2 + ∆φ (Z,b jet) 2 ].
The differential and normalized differential cross sections for the Z + ≥ 2 b jets production are measured as functions of nine kinematic observables describing the final state: the transverse momenta of the Z boson and the two highest p T (leading and subleading) b jets; the absolute pseudorapidity of the highest p T b jet; the angular correlations between the Z boson and the b jets; the invariant mass of the two b jets (m bb ); the invariant mass of the Z boson and the two b jets (m Zbb ); the angular separation between two b jets (∆R bb ); the minimum separation Several Monte Carlo event generators are used to simulate signal and background processes. The Drell-Yan (DY) process with exclusive jet multiplicity up to 2 is simulated at next-toleading order (NLO) precision by MADGRAPH5 aMC@NLO (denoted MG5 aMC) [16] version 2.3.2.2 for 2016 data and version 2.6.0 for the 2017-2018 data with the FXFX [17] matching between the jets from matrix element calculations and parton showers. The NNPDF 3.0 NLO and NNPDF 3.1 next-to-NLO (NNLO) PDF sets [18] are used for the 2016 and 2017-2018 datataking periods, respectively.
A second sample is produced with MG5 aMC version 2.2.2 (2.4.2) for the 2016 (2017-2018) datataking period using leading order (LO) matrix elements for pp → Z + n jets (n ≤ 4), with the MLM matching scheme [19]. The NNPDF 3.0 LO and NNPDF 3.1 NNLO PDF sets are used for the 2016 and 2017-2018 data-taking periods, respectively. The MG5 aMC generator is interfaced with PYTHIA v8.212 (v8.230) [20] for parton showering, hadronization, and the underlying event simulation using tune CUETP8M1 [21] (CP5 [22]) for the 2016 (2017-2018) data-taking period. Matching between the matrix element generators and the parton shower jets is done with the matching scales set at 19 and 30 GeV for the LO and NLO MG5 aMC generators, respectively.
A third inclusive sample has been produced with SHERPA v2.2.4 [23] to generate pp → Z + n jets events, with n ≤ 2 at NLO and n = 3, 4 at LO. The merging with the SHERPA parton shower is done via the MEPS@NLO prescription [24][25][26] with a matching scale of 20 GeV. The NNPDF 3.0 NLO PDF and a dedicated set of tuned parton shower parameters developed by the SHERPA authors are used. In the matrix element calculation, the value of the NNPDF 3.0 strong coupling (α S ) at the Z boson mass is set to 0.130 (0.118) for MG5 aMC at LO and is set to 0.118 (0.118) for MG5 aMC at NLO for the 2016 (2017-2018) data-taking period. All three signal samples use the five-flavor number scheme with b quark density allowed in the initial state via a b and charm (c) quark PDF, with the b and c quarks typically being massless.
The DY process includes signal events with a Z boson and a jet initiated by a b quark (b jet) as well as background events from Z + c jet and Z + light jet in which a jet is initiated by a charm quark (c jet) and a light quark or gluon (light jet), respectively. All DY processes are scaled to NNLO accuracy in QCD based on the Z + jets cross sections calculated with FEWZ v3.1 [27]. Other background contributions come from top quark-antiquark (tt) and single top quark (sand t-channel, and tW) production as well as from diboson (WW, WZ, and ZZ) processes. The tt process is generated at NLO by POWHEG v2.0 [28][29][30] in 2016 and MG5 aMC in 2017-2018 data-taking periods and interfaced with PYTHIA v8.212 (CUETP8M2T4 [31]), v8.226 (CP5), and v8.230 (CP5) for 2016, 2017, and 2018, respectively, for parton showering and hadronization. The single top quark t-channel and tW processes are simulated at NLO with POWHEG v2.0 and POWHEG v1.0, respectively, whereas the s-channel process is simulated with MG5 aMC at NLO, both interfaced with PYTHIA. The PYTHIA v8.212, v8.212, and v8.205 are used for single top quark t-channel, tW, and s-channel processes, respectively, in 2016, while PYTHIA v8.226 and v8.230 are used in 2017 and 2018, respectively, for all single top quark process. The diboson background processes are simulated at LO using PYTHIA v8.205, v8.226, and v8.230 for 2016, 2017, and 2018, respectively.
All simulation samples, except those based on SHERPA, use PYTHIA with the CUETP8M1 or CUETP8M2T4 or CP5 tune. The CUETP8M1 tune includes the LO NNPDF 2.3 [32] PDFs set with the strong coupling α S (m Z ) set 0.137 for space-and time-like shower simulation. The CUETP8M2T4 tune is based on the CUETP8M1 tune, which includes the NNPDF30 lo as 0130 PDF set, but uses a lower value of α S = 0.111 for the initial-state radiation component of the parton shower. The CP5 tune use the NNPDF3.1 PDF set at NNLO, with α S values of 0.118, and running according to NLO evolution. The diboson and single top quark predictions are scaled to NNLO accuracy [33,34]. The prediction for tt production is normalized to NNLO calculations in pQCD including resummation of next-to-next-to-leading logarithmic soft-gluon terms with TOP++ 2.0 [35][36][37][38][39][40][41].
The MC samples are processed with the full CMS detector simulation based on GEANT4 [42]. The simulated events are reconstructed using the same algorithms used for the data, and include additional interactions per bunch crossings, referred to as pileup (PU). Simulated events are weighted so that the PU distribution reproduces the one observed in data, which has an average of about 23 (32) interactions per bunch crossing in 2016 (2017-2018).

Event selection
The final state in this analysis contains a Z boson (Z → ee/µµ) and at least one jet coming from a b quark. Therefore, the measurement requires the reconstruction of electrons, muons, and b jets. Events are reconstructed using a particle-flow (PF) algorithm [43], which identifies each individual particle candidate using information from various subdetectors. Energy deposits are measured in the calorimeters, and charged particles are identified in the central tracking and muon systems.
The collision events are required to pass single lepton triggers. They must have at least one electron (muon) candidate with a minimum p T of 27, 32, and 32 GeV (24, 27, and 24 GeV) during the 2016, 2017, and 2018 data-taking periods, respectively.
The candidate vertex with the largest value of summed physics-object p 2 T is the primary pp interaction vertex. The physics objects are the leptons and jets, clustered using the anti-k T algorithm [44] with the tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum ( p miss T ), the negative vector p T sum of all tracks. The leptons in the analysis are required to originate from the primary vertex.
Electrons are identified as a primary charged particle track and potentially many ECAL energy clusters. These clusters correspond to the electron track extrapolation to the ECAL and to possible bremsstrahlung photons emitted by electrons along the way through the tracker material [45]. The electron momentum is estimated by combining the energy measurement in the ECAL with the momentum measurement in the tracker. The longitudinal (transverse) impact parameters with respect to the beam line for the barrel (|η| < 1.48) and endcap (1.48 < |η| < 3.0) regions are required to be less than 0.10 (0.05) cm and 0.20 (0.10) cm, respectively. The momentum resolution for electrons with p T ≈ 45 GeV ranges from 1.7 to 4.5% depending on η. It is generally better in the barrel region than in the endcaps, and also depends on the bremsstrahlung energy emitted by the electron as it traverses the material in front of the ECAL. The dielectron mass resolution for Z → ee decays when both electrons are in the ECAL barrel is 1.9%, and degrades to 2.9% when both electrons are in the endcaps.
Muons are measured in the pseudorapidity range |η| < 2.4. The single muon trigger efficiency exceeds 90% over the full η range, and the efficiency to reconstruct and identify muons is greater than 96% for p T > 25 GeV. Efficiencies are measured using the data-driven tag-andprobe methods. Matching muons to tracks measured in the silicon tracker results in a transverse momentum resolution for muons with p T up to 100 GeV of 1% in the barrel and 3% in the endcaps [46]. The corrections for the muon transverse momentum scale and resolution [47] are also applied to remove biases of the muon momentum from detector misalignments or magnetic field uncertainties.
To reduce the misidentification rate, leptons are required to be isolated. The relative isolation variable is defined as: which is the sum of the p T values for all PF candidates within a cone of radius ∆R = 0.3 (0.4) centered on the electron (muon) track direction. Here, ∑ E neutral Ti and ∑ E γ Ti are the scalar sums of the transverse energies of neutral hadrons and photons, respectively. The quantity ∑ p charged Ti represents the p T sum of the charged hadrons associated with the primary vertex and p PU T is the contribution from PU. For electrons, p PU T is evaluated using the "jet area" method described in [48]. For muons, p PU T is taken to be half of the scalar p T sum deposited in the isolation cone by charged particles not associated with the primary vertex. The factor of one half corresponds approximately to the ratio of neutral to charged hadrons produced in the hadronization of PU interactions. Finally, p T stands for the lepton (electron or muon) transverse momentum. For muons I rel is required to be less than 0.15. For electrons I rel is p T dependent; for 25 GeV I rel < 0.021 (0.040) and for 35 GeV I rel < 0.015 (0.029) in the barrel (endcap) region [49].
The small residual differences in the lepton (electron and muon) trigger, identification, and isolation efficiencies between data and simulation are measured using a "tag-and-probe" method [50] and are included by applying scale factors to simulated events.
Jets are clustered from PF candidates using the infrared-and collinear-safe anti-k T algorithm with a distance parameter of 0.4, as implemented in the FASTJET package [51]. The jet momentum is the vector sum of all particle momenta in the jet, and is found from simulation to be within 5-10% of the true momentum over the entire p T spectrum and detector acceptance. Pileup interactions could result in additional tracks and calorimetric energy depositions, possibly increasing the measured jet momentum. To mitigate this effect, tracks identified as originating from PU vertices are discarded and an offset correction [48] is applied to correct for remaining contributions.
The reconstructed jet energy scale (JES) is corrected using a factorized model to compensate for the nonlinear and nonuniform response in the calorimeters. Corrections are derived from simulation to bring, on average, the measured response of jets to that of particle-level jets. In situ measurements of the momentum balance in dijet, multijet, photon + jet, and leptonically decaying Z + jet events are used to account for any residual differences between the JES in data and simulation [52]. The jet energy resolution (JER) in the simulation is smeared to reproduce what is observed in data. The JER amounts typically to 16% at 30 GeV and 8% at 100 GeV. Additional selection criteria are applied to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures [53]. Jets identified as likely to originate from PU are also removed by using a pileup jet ID discriminator for jets with 30 < p T ≤ 50 GeV [54]. For jets with p T > 50 GeV, the pileup jet ID discriminator is not applied, since here the pileup event contribution is negligible [55].
The missing transverse momentum vector p miss T is computed as the negative vector sum of the transverse momenta of all the PF candidates in an event, and its magnitude is denoted as p miss T [56]. The p miss T is modified to include corrections to the energy scale of the reconstructed jets in the event.
A neural network algorithm (DEEPCSV) [57] is used to discriminate b jets from c and light jets. In the DEEPCSV algorithm, the significance of the secondary vertex displacement is combined with track-based information in a machine-learned multivariate analysis to increase the efficiency compared with cut-based algorithms.
The Z boson candidate is reconstructed from two same flavor leptons with opposite charge having invariant mass (m ) between 71-111 GeV. Here, leading and subleading leptons must have p T > 35 GeV and p T > 25 GeV, respectively. The Z boson event must contain at least one b jet with p T > 30 GeV, |η| < 2.4, and have p miss T < 50 GeV. The b jets are selected with the tight operating point of the b tag discriminator. The overlap between the b jet and a lepton from the Z boson decay is removed by requiring a minimum distance of ∆R > 0.4 between them.
In simulation, the classification of reconstructed Z + jet events into Z + b jet, Z + c jet, and Z + light jet categories is based on the flavors of reconstructed jets with p T > 30 GeV and |η| < 2.4. They are classified as b or c jets if they are matched to the generated b or c hadrons. The matching is done using the ghost association procedure in which the modulus of the hadron four-momentum is set to a small number, retaining only the directional information [57]. In the case when both c and b hadrons are matched, the jet is considered as a b jet. Based on reconstructed jets with defined flavors, events are classified as Z + b jet events if they contain at least one b jet. Of the remaining events, those that contain at least one c hadron are considered as Z + c jet events and those that contain neither b nor c hadrons are classified as Z + light jet events. In this analysis the tagging efficiency of Z + b jet (Z + c jet) events is estimated as the fraction of events that pass the b (c) hadron and b tagging (c tagging) requirement over the number of events that pass just the b (c) hadron requirement. In simulations for the tight operating point, the tagging efficiencies for b (c) quarks are 50-60% (25%) and misidentification probabilities are about 0.1% (<1.5%) and 2-5% (≈24%) for a light (light) and c (b) jet.
Data from tt, W + jets, and multijet events that are enriched with a given flavor of interest are used to measure the b, c, and light jet tagging efficiencies. These are compared with the efficiencies calculated from simulation. Small differences between data and simulation are corrected through scale factors applied to simulation.
The selected events with a Z boson and at least one b jet have ≈18% background contamination, with the major contributions arising from the Z + c jet (≈7%), tt (≈7%), and Z + light jet (≈4%) processes. In the Z + ≥ 2 b jets signal region a significant source of background originates from tt production, accounting for ≈30% of the events in the signal region and ≈85% of the background. To validate the simulation, control regions that are enriched with processes of interests are constructed.
For the Z + ≥ 1 b jet measurement, the tt control region is constructed by requiring an oppositely charged electron and muon pair with at least one b jet (eµ + ≥ 1 b jet) with the same selection criteria discussed above. A sample containing about 92% of tt events is obtained for each year of data taking. The non-tt contributions to this region are DY, single top, W + jets, and diboson events. A scale factor, (data − non-tt background)/(tt MC), is extracted from this region. It is 0.98 for 2016 and 0.87 for 2017-2018 data-taking periods and is applied in the signal region to the tt simulation in the ee and µµ channel.
In the Z + ≥ 2 b jets signal region, the tt background is estimated by a data-driven method using an eµ + ≥ 2 b jets sample where the purity of the tt events is ≈96%. A normalization factor is derived by comparing the numbers of tt events seen in the sidebands of the dilepton invariant mass distributions from eµ + ≥ 2 b jets and signal (Z + ≥ 2 b jets) regions. This factor is applied to the number of tt (→ eµ) + ≥ 2 b jets events within the Z boson mass window to obtain the tt background contribution in the signal region.
The Z + light jet control region (CR1) is constructed by requiring events to have inclusive jets without any condition on the tagger discriminator. A sample containing about 83-85% of Z + light jet events is obtained for each year of data taking. The non-(Z + light jet) backgrounds are Z + c jet, Z + b jet, diboson, tt, single top, and W + jets events. A scale factor, (data − non-(Z + light jet) background)/(Z + light jet MC), is extracted from this region and applied in the signal region to the Z + light jet MC. The scale factors 0.94, 0.98, and 0.90 are used for the 2016, 2017, and 2018 data-taking periods, respectively. As a cross-check, another Z + light jet control region (CR2) is defined by requiring events to pass anti-b and anti-c tagging tight operating points of b and c quark discriminators. The scale factors in CR1 and CR2 are consistent within 1-2%.
The Z + c jet control region (CR3) is constructed by requiring c jet events with the tight operating points of the c tag discriminator. A sample containing about 41-55% of Z + c jet events is obtained for each year of data taking. Scale factors, (data − non-(Z + c jet) background)/(Z + c jet MC), 0.75, 1.08, and 0.99 for 2016, 2017, and 2018 data-taking periods, are extracted from this region and are applied to the Z + c jet MC sample in the signal region. The non-Z + c jet contributions to this region are Z + b jet, Z + light jet, tt, single top, W + jets, and diboson events. As a cross-check, another Z + c jet control region (CR4) is defined by requiring events to pass tight operating points of the c quark discriminator and inverted b quark discriminator. The scale factors in CR3 and CR4 are consistent within 1-2%.
The collective contribution of single top, W + jets, and diboson processes is small (<2%). Thus, data-driven estimates are not considered for these processes and their contributions are determined from simulation.

Unfolding and systematic uncertainties
The reconstructed distributions are unfolded to generator-level quantities using the TUNFOLD package (v17.5) [58], which is based on a least squares fit, to correct for the detector resolution and the selection efficiencies. The simulated events are classified according to generator-level information. The generator-level jets are formed from stable particles (cτ > 1 cm), except neutrinos, using the same anti-k T algorithm as for the reconstructed jets. Any overlap between generator-level jets and a pair of leptons from the Z boson decays is removed by requiring a minimum distance of 0.4 between them. If an event consists of at least one (two) generator-level jet(s) containing a b hadron, the event is classified as a Z + ≥ 1 b jet (Z + ≥ 2 b jets) event. The generator-level leptons are "dressed" by adding the momenta of all photons within ∆R ≤ 0.1 around the lepton direction to account for the final-state radiation effects. The distributions are unfolded to the generator-level in the fiducial region defined in Table 1.
The reconstructed distributions are unfolded with a response matrix that describes the migration probability between the particle-and reconstruction-level quantities. The response matrix is constructed by spatially matching the reconstruction-level leptons and b jet objects to the corresponding generator-level objects within ∆R < 0.3. The selected Z + ≥ 1 b jet and Z + ≥ 2 b jets samples at reconstruction-level can have background events that do not match to generator-level objects. The background events are channel dependent and they are subtracted from reconstructed events for each year and channel separately. The background subtracted reconstruction-level distributions are unfolded. The unfolded distributions are corrected for those events that have generator-level objects in the fiducial volume, which do not match with reconstructed objects. Events generated with MG5 aMC (NLO) and CP5 tune are used to construct the response matrix. The MG5 aMC (NLO) with CUETP8M1 tune, MG5 aMC (LO) with CP5 and CUETP8M1 tunes, and SHERPA are used for comparison with unfolded data.
The effect of regularization, a procedure to control large fluctuations of unfolded distributions due to statistical uncertainties of observed quantities [59], is negligible for the selected ob-servables. Therefore, all studies are performed without regularization. The unfolding procedure is validated with statistically independent tests in which MG5 aMC NLO distributions are unfolded with LO MG5 aMC and compared with generator-level NLO MG5 aMC predictions. There is good agreement between the unfolded and the generator-level predictions. The resolution of variables to be measured in the present analysis is not expected to differ for the electron and muon channels. This procedure is validated by unfolding generator-level MC distributions for the electron channel with the response matrix constructed for the muon channel. The background and acceptance distributions are taken from the electron channel only. The resulting unfolded distributions are compared with the unfolded results when the electron channel is used for the construction of the response matrix. A good agreement is observed between the two unfolded results, which establishes that the response matrices for the two channels are consistent with each other. Therefore, the electron and muon data are added and unfolded with the combined response matrix.
The data distributions for the Z + ≥ 1 b jet (Z + ≥ 2 b jets) analysis are unfolded with a response matrix constructed using the Z + b jet events in the DY sample that is simulated with the MG5 aMC (NLO) generator.

Experimental uncertainties
There are a number of sources of uncertainty that have a significant impact on the cross section measurements. These include For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) analysis, the JES introduces a systematic uncertainty of ≈3% (≈6%) on the integrated cross section whereas the effect of JER is less than 1%. The JES has varying effects on the differential and normalized differential cross section distributions for the different observables of Z + ≥ 1 b jet (Z + ≥ 2 b jets) events, with maximum effects of 9 and 7% (24 and 18%), respectively, in the low event count bins.
• b tagging/mistagging: The uncertainty due to b tagging and mistagging scale factors is evaluated by varying them up and down within their uncertainties. The scale factor for tagging b jets is correlated with the c jets scale factor, so for tagging b and mistagging c jets, scale factors are varied up and down simultaneously, while the scale factor for mistagging light jets is varied independently. The combined b tagging uncertainty is then calculated as the sum in quadrature of these variations.
For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) analysis the uncertainties in the b tagging and mistagging scale factors affect the cross section measurements by 3.0% (5.8%). The differential and normalized differential cross section distributions are affected by 2-8% and 0.5-5.0%, respectively, for selected observables of Z + ≥ 1 b jet events. These uncertainties are less than 14.5 and 9.5% in the Z + ≥ 2 b jets differential and normalized differential cross section measurements, respectively.
• Unclustered energy of p miss T : The effect of the uncertainty in the remaining component of p miss T , that is unclustered energy, is estimated by varying its contributions within the uncertainties, which are treated as correlated across years and channels. For Z + ≥ 1 b jet (Z + ≥ 2 b jets), this uncertainty source affects the integrated cross section by up to 2.8% (3.6%). The differential and normalized differential cross section distributions of different observables of Z + ≥ 1 b jet (Z + ≥ 2 b jets) events are affected by up to 3.8% (8.4%) and 1.1% (7.1%), respectively.
• Lepton selection: Electrons and muons are corrected for trigger, reconstruction, and identification efficiencies and the effects of these corrections are estimated by varying the corresponding scale factors within their uncertainties. For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) measurement, the uncertainty in the electron selection efficiency results in an uncertainty in the integrated cross section of about 4.6% (4.3%) for the electron channel, but for the combined channel the effect is about 1.5% (1.4%). Normalized distributions are not biased due to electron selection efficiency because associated uncertainties cancel in the ratio. The integrated cross section has a systematic uncertainty due to muon selection of less than 1%.
• Pileup reweighting: The PU distribution in the simulated samples is reweighted to match that of the data. The corresponding uncertainty is estimated by varying the total inelastic cross section by ±4.6% [63]. For Z + ≥ 1 b jet (Z + ≥ 2 b jets) the uncertainty in the PU simulation affects the integrated cross section by 1.7-2.4% (2.1-2.9%). For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) differential and normalized differential cross section distributions the effects are less than 2.7% (6.8%) and 1.3% (4.3%), respectively.
• Background estimation: In the Z + ≥ 1 b jet cross section measurements the Z + c jet and Z + light jet contributions come from simulation and are validated in control regions as discussed in Section 4. The difference between data and MC simulation is extracted from each control region and is applied as a scale factor for the Z + c jet and Z + light jet MC predictions in the signal region, respectively. These scale factors can have associated systematic uncertainties from experimental sources (JES, b tagging, PU, lepton selection), although these are already considered while evaluating systematic uncertainties in the signal region. For all of the control regions the data and MC simulation values are within 10%. Hence, an additional systematic uncertainty of 10% is assigned to the background modeling. For Z + ≥ 1 b jet, the effect of these uncertainties is ≈2.2% on the integrated cross section, whereas the effect of uncertainties related to Z + c jet and Z + light jet backgrounds is about 1.5%. The corresponding effects on the differential and normalized differential cross section distributions range from 1.5 to 4.8% and 0.5 to 3.3%, respectively. In the Z + ≥ 2 b jets measurements the background uncertainty originates from the tt, Drell-Yan, diboson, and single top estimations. The tt uncertainty is the dominant source, and is derived by comparing the normalization coefficients obtained using fits in the dilepton invariant mass and missing transverse momentum distributions. The background uncertainty in the Z + ≥ 2 b jets integrated cross section is 2.4%. In the Z + ≥ 2 b jets differential and normalized differential cross sections this uncertainty is less than 4.9 and 2.7%, respectively.
• Model dependence: The unfolding model uncertainty is estimated by reweighting the signal MC (MG5 aMC at NLO) with a scale factor that is calculated by fitting the ratio of background-subtracted data to signal simulation of normalized distributions and using it as an alternative model for the unfolding. It affects the differential and normalized differential cross section distributions of different observables in the Z + ≥ 1 b jet analysis up to 1.5 and 1.2%, respectively. • Pileup jet identification: The PU jet identification criteria have different performance in the data and simulated events. This difference is corrected as a scale factor. For Z + ≥ 1 b jet and Z + ≥ 2 b jets measurements, the effect of uncertainties associated with the scale factor is less than 0.7% in the integrated cross section. The differential and normalized differential cross section distributions of different observables of Z + ≥ 1 b jet (Z + ≥ 2 b jets) events are affected up to 0.5% (1.2%). • L1 prefiring: During the 2016-2017 data-taking period there was a timing issue in the 2.0 < |η| < 3.0 region of the ECAL where L1 trigger primitives were mistakenly associated with the previous bunch crossing. This led to a self-vetoing of events with significant energy in the 2.0 < |η| < 3.0 region and a loss of trigger efficiency. Correction factors are determined from data for the efficiency loss and applied to the simulation of the dielectron and dimuon channels. The uncertainty due to the prefiring scale factors is evaluated by varying them up and down within their uncertainties. For the Z + ≥ 1 b jet and Z + ≥ 2 b jets differential and normalized differential cross section distributions as well as on the integrated cross section the effects are less than 0.5%.
In the cross section ratio measurement, the luminosity uncertainty is effectively cancelled. All other experimental uncertainties except the background estimation are evaluated as follows.
For an uncertainty of interest, the Z + ≥ 2 b jets and Z + ≥ 1 b jet cross sections are obtained for up and down systematic variations described above and the corresponding ratios are obtained. The uncertainty is derived by comparing these ratios with the nominal ones. The background estimations in the Z + ≥ 2 b jets and Z + ≥ 1 b jet events are based on independent methods and control regions. Therefore, the background uncertainties are uncorrelated and derived from quadrature sums of those from the Z + ≥ 2 b jets and Z + ≥ 1 b jet measurements.

Theoretical uncertainties
The uncertainties coming from the theory calculations and MC generators are estimated as follows.
• Renormalization and factorization scales: The scale uncertainties are estimated using a set of weights provided by the generator that corresponds to variations of renormalization (µ R ) and factorization (µ F ) scales by factors of 0.5 and 2. The unfolded distributions are obtained for all combinations, except for µ F /µ R = 0.25 or 4, and their envelope is quoted as the uncertainty. The integrated cross sections for Z + ≥ 1 b jet (Z + ≥ 2 b jets) have scale uncertainties of 2.6% (2.5%), 2.9% (2.3%), and 2.1% (2.5%) in the dielectron, dimuon, and combined channels, respectively. The uncertainties in the Z + ≥ 1 b jet (Z + ≥ 2 b jets) differential distributions vary from 1 to 10% (0.5 to 4.9%) and for the normalized differential distributions they vary from 0.5 to 9.4% (0.5 to 5.0%). Similarly, uncertainties are evaluated for MG5 aMC (NLO) with CP5 tune generator-level distributions and the effect on the Z + ≥ 1 b jet (Z + ≥ 2 b jets) integrated cross section at the generator-level is around 6.6% (9.0%). The renormalization and factorization scale uncertainties in the cross section ratios are the quadrature sum of those from the Z + ≥ 2 b jets and Z + ≥ 1 b jet measurements. • PDF: The uncertainty in the parton distribution functions is estimated using different Hessian eigenvectors of the NNPDF 3.1 PDF sets. Applying the master formula of the Hessian PDF error calculation [64], this uncertainty is estimated on a bin-by-bin basis for the differential distributions. The integrated cross sections for the Z + ≥ 1 b jet (Z + ≥ 2 b jets) process have PDF uncertainties around 0.3-0.4% (0.3%) in the dielectron, dimuon, and combined channels. For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) analysis, the uncertainties in the differential distributions for the dielectron, dimuon, and combined channels are 0.5 to 4.2% (0.5-4.0%), whereas the uncertainties in the normalized differential cross sections for these channels are in the range 0.5-3.3% (0.5-3.2%). The PDF uncertainties are treated as correlated across the years and channels. Similarly, the effect of PDF (NNPDF 3.1) uncertainty in the generator-level distributions is obtained on a bin-by-bin basis of generator-level distributions with PDF weights applied. The effect on the integrated cross sections of the Z + ≥ 1 b jet and Z + ≥ 2 b jets at the generator-level is around 1.0%. The Z + ≥ 2 b jets and Z + ≥ 1 b jet PDF uncertainties are summed in quadrature for the cross section ratios. The uncertainties in the PDFs are also estimated for SHERPA at generator-level using the 102 replicas of the NNPDF 3.0 PDF set. The integrated cross sections for the Z + ≥ 1 b jet (Z + ≥ 2 b jets) process have PDF uncertainties around 1.4% and 2.0%.
• Strong coupling (α S ): The uncertainties due to α S are estimated by using two additional PDF members corresponding to upper (0.120) and lower (0.116) values with respect to the central (0.118) α S value. The integrated cross sections for the Z + ≥ 1 b jet (Z + ≥ 2 b jets) process have α S uncertainties around 0.2-0.3% (0.1%) in the dielectron, dimuon, and combined channels. For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) analysis, the uncertainty in the differential distributions for the dielectron, dimuon, and combined channels is 0.5-1.3% (0.5-1.6%), and the uncertainties in the normalized differential cross sections for these channels are in the range 0.5 to 1.1% (0.5-1.3%). The effect on Z + ≥ 1 b jet (Z + ≥ 2 b jets) integrated cross section at the generator-level is 1.2% (0.1%). The Z + ≥ 2 b jets and Z + ≥ 1 b jet α S uncertainties are summed in quadrature for the cross section ratios. Table 2 lists the uncertainties in the Z + ≥ 1 b jet and Z + ≥ 2 b jets integrated cross sections. Tables 3 and 4 summarize the range of uncertainty in the differential and normalized differential cross section distributions for the Z + ≥ 1 b jet combined channel, respectively. For the Z + ≥ 2 b jets measurements, Tables 5 and 6 show the uncertainty in the differential and normalized differential cross sections, respectively. All systematic uncertainties are treated as 100% correlated across the years. In summary, the Z + ≥ 1 b jet (Z + ≥ 2 b jets) integrated cross sections have total experimental systematic uncertainties of 5.9 and 6.1% (9.7 and 9.9%), respectively, in the dimuon and combined channels. For the electron channel the uncertainty is a bit higher, up to 7.6% (Z + ≥ 1 b jet) and 11.4% (Z + ≥ 2 b jets), owing to the electron selection uncertainty. For the Z + ≥ 1 b jet (Z + ≥ 2 b jets) measured integrated cross section the statistical uncertainty is less than 1% (5%).  Table 3: Summary of the uncertainties (in percent) in the differential cross section distributions for the combined channel in the Z + ≥ 1 b jet events.  Table 4: Summary of the uncertainties (in percent) in the normalized differential distributions for the combined channel in the Z + ≥ 1 b jet events.

Results
The measured Z + ≥ 1 b jet and Z + ≥ 2 b jets integrated cross sections in the fiducial region described in Table 1, are 6.52 ± 0.04 (stat) ± 0.40 (syst) ± 0.14 (theo) pb and 0.65 ± 0.03 (stat) ± 0.07 (syst) ± 0.02 (theo) pb in the combined channel, respectively. The measurements are consistent for the dielectron, dimuon, and combined channels as reported in Table 7, therefore the differential and normalized differential distributions are shown only for the combined channel. The measured cross sections are compared with predictions from MG5 29,38, and 21%, respectively. The measured value of the cross section ratio of the Z + ≥ 2 b jets and Z + ≥ 1 b jet for the combined channel is 0.100 ± 0.005 (stat) ± 0.007 (syst) ± 0.003 (theo) and, as expected, has smaller uncertainties. This measurement is in agreement with the MG5 aMC (LO, NNPDF 3.0, CUETP8M1) and SHERPA values given the large uncertainties associated with the predictions. Table 7 summarizes the measured and predicted cross sections for the Z + ≥ 1 b jet and Z + ≥ 2 b jets processes. Table 7: Measured and predicted cross sections (in pb) for the Z + ≥ 1 b jet and Z + ≥ 2 b jets final states. The cross section ratios between the Z + ≥ 2 b jets and Z + ≥ 1 b jet are shown in the last three rows for the dielectron, dimuon, and combined channels. In the measured results the first, second, and third uncertainties correspond to the statistical, systematic, and theoretical sources, respectively. The MG5 aMC (NLO) predictions include theoretical uncertainties (PDF, and renormalization and factorization scales). The measured differential cross section distributions and the corresponding ones normalized by the integrated fiducial cross sections compared with different MC predictions in the combined channel are shown in Figs. 2-16. In general, the differential cross section distributions predicted by SHERPA tend to overestimate data by 20-30%, depending on the kinematic variables.
The differential and normalized differential cross section distributions as functions of the p Z T in the selected Z + ≥ 1 b jet events are shown in Fig. 2. The shapes of these distributions are described best by MG5 aMC (LO, NNPDF 3.1, CP5), while MG5 aMC (NLO) and SHERPA predictions vary up to 30% depending on p Z T . The p b jet T and |η b jet | differential and normalized differential cross section distributions are shown in Figs. 3 and 4, respectively. The shapes of distributions are well described by all simulations, except for the MG5 aMC (LO, NNPDF 3.1, CP5) p b jet T spectrum, which deviates up to 25% in the higher p T region. The differential and normalized differential cross section distributions of ∆φ (Z,b jet) , ∆y (Z,b jet) , and ∆R (Z,b jet) are shown in Figs. 5-7. The shapes of these distributions are best described by the SHERPA predictions. The MG5 aMC (LO) predictions show the largest deviation from data in the high ∆y (Z,b jet) and ∆R (Z,b jet) regions, which is significantly improved with the MG5 aMC (NLO) prediction. In summary, SHERPA simulations provide a good description of different kinematic observables except for the higher p Z T region for the normalized differential distributions. The MG5 aMC LO and NLO simulations provide varying levels of agreement for the observables. For the Z + ≥ 2 b jets final states, differential and the normalized differential cross sections as functions of the leading b jet p T and |η|, subleading b jet p T , and Z boson transverse momentum are shown in Figs. 8-11. The shapes of distributions in data are well-described by predictions from MG5 aMC (LO and NLO; with both PDFs) and SHERPA, except for a couple of bins in the high-p T and high-|η| regions. In the differential cross sections MG5 aMC (NLO, NNPDF 3.1, CP5), MG5 aMC (NLO, NNPDF 3.0, CUETP8M1), and SHERPA overestimate data by 20-30%, 30-50%, and 20-30%, respectively, whereas the MG5 aMC (LO) predictions are in good agreement with data. The angular correlations and asymmetry distributions of the Z + ≥ 2 b jets    system are presented in Figs. 12-14. Although MG5 aMC (NLO) predictions are consistent with data in the normalized differential cross section distributions, except for the ∆R bb , A Zbb regions of 0.5-1.5 and 0.8-1, respectively, they are generally higher than the measured differential cross sections by 20-30% and 30-50% for MG5 aMC (NLO, NNPDF 3.1, CP5) and MG5 aMC (NLO, NNPDF 3.0, CUETP8M1), respectively. The MG5 aMC (LO) predictions show large deviations from data in the ∆R bb distributions above 3.4, as can be seen in Fig. 12. In the ∆R min Zbb distribu- tion, Fig. 13, the MG5 aMC (LO) predictions are consistent with data, although they are higher than data in the last bin of the A Zbb distribution (0.8-1) in Fig. 14, similar to the MG5 aMC (NLO, NNPDF 3.1, CP5) case. The MG5 aMC (NLO, NNPDF 3.0, CUETP8M1) simulation overestimates data in the differential ∆R min Zbb distribution. The SHERPA predictions describe well the shape of the measured ∆R bb differential cross sections but they significantly overestimate data in the low (<1.2) and high (>0.5) regions of the ∆R min Zbb and A Zbb distributions, respectively. Figure 15 shows invariant mass for two b jets events, and Fig. 16 shows the invariant mass for the Z + ≥ 2 b jets events. The shapes of these distributions are described better by the MG5 aMC (NLO) and SHERPA predictions than the MG5 aMC (LO) ones. As for the rates, the data are described best by the MG5 aMC (NLO, NNPDF 3.1, CP5) predictions.
The σ(Z + ≥ 2 b jets)/σ(Z + ≥ 1 b jet) cross section ratio distributions as functions of the leading b jet transverse momentum and absolute pseudorapidity are shown in Fig. 17. The ratio gradually increases (from 0.05 to 0.25) with the leading b jet p T (ranging from 30 to 200 GeV), but is nearly independent of the pseudorapidity of the leading b jet. The increase in the ratio is due to a kinematic effect; at larger p T of the leading jet, the kinematic acceptance for the subleading jet would increase. Predictions from MG5 aMC (LO), MG5 aMC (NLO), and SHERPA describe the measured ratios within uncertainties.     Figure 9: (left) Differential cross section and the (right) normalized differential cross section distributions as functions of the leading b jet absolute pseudorapidity for the Z + ≥ 2 b jets events.   Figure 12: (left) Differential cross section and the (right) normalized differential cross section distributions as functions of the angular separation between two b jets, ∆R bb for the Z + ≥ 2 b jets events.         Figure 17: Distributions of the cross section ratios, σ(Z + ≥ 2 b jets)/σ(Z + ≥ 1 b jet), as functions of the (left) leading b jet transverse momentum and (right) absolute pseudorapidity.

Summary
A measurement of fiducial cross sections of the Z + ≥ 1 b jet and Z + ≥ 2 b jets processes, along with the differential and normalized differential cross section distributions of different kinematic observables, is performed using proton-proton collisions data at √ s = 13 TeV collected by the CMS experiment at the CERN LHC. This is the first measurement of these processes based on data collected during the 2016-2018 LHC running period. The fiducial cross sections are measured to be 6.52 ± 0.04 (stat) ± 0.40 (syst) ± 0.14 (theo) pb for the Z + ≥ 1 b jet and 0.65 ± 0.03 (stat) ± 0.07 (syst) ± 0.02 (theo) pb for the Z + ≥ 2 b jets, which are better described by the MG5 aMC leading order (LO) simulation but overestimated by MG5 aMC nextto-LO (NLO) and SHERPA predictions. Since all predictions are normalized to the inclusive Z + jets next-to-NLO cross section, differences between MG5 aMC (NLO) and MG5 aMC (LO) results could be attributable to variations in shapes of observables and settings (parton distribution functions, Monte Carlo tunes, matching schemes) used in those simulations. The SHERPA simulation overestimates the measured integrated cross section; however, it provides a good description of the shapes of various kinematic observables. The MG5 aMC (LO) and MG5 aMC (NLO) generators interfaced with PYTHIA describe the fiducial cross section better but do not completely describe the shapes of the kinematic observables. Present measurements can be used as an input for the further optimization of the simulation parameters. The measured value of the cross section ratio of the Z + ≥ 2 b jets to Z + ≥ 1 b jet is 0.100 ± 0.005 (stat) ± 0.007 (syst) ± 0.003 (theo), which is well described by the MG5 aMC (LO, NNPDF 3.0, CUETP8M1) and SHERPA calculations but overestimated by MG5 aMC (NLO) prediction. [8] CMS Collaboration, "Measurements of the associated production of a Z boson and b jets in pp collisions at √ s = 8 TeV", Eur. Phys. J. C 77 (2017) 751, doi:10.1140/epjc/s10052-017-5140-y, arXiv:1611.06507v2.