Measurement of cross-sections for production of a $Z$ boson in association with a flavor-inclusive or doubly $b$-tagged large-radius jet in proton-proton collisions at $\sqrt{s} = 13$ TeV with the ATLAS experiment

We present measurements of cross-sections for production of a leptonically decaying $Z$ boson in association with a large-radius jet in 13 TeV proton-proton collisions at the LHC, using $36~\mathrm{fb}^{-1}$ of data from the ATLAS detector. Integrated and differential cross-sections are measured at particle level in both a flavor-inclusive and a doubly $b$-tagged fiducial phase-space. The large-radius jet mass and transverse momentum, its kinematic relationship to the $Z$ boson, and the angular separation of $b$-tagged small-radius track-jets within the large-radius jet are measured. This measurement constitutes an important test of perturbative quantum chromodynamics in kinematic and flavor configurations relevant to several Higgs boson and beyond-Standard-Model physics analyses. The results highlight issues with modeling of additional hadronic activity in the flavor-inclusive selection, and a distinction between flavor-number schemes in the $b$-tagged phase-space.


Introduction
Since the proposal of a $H \to b\bar{b}$ discovery channel based on the structure of a high-momentum jet [1], "boosted jet" methods have been a major feature of Higgs and other experimental analyses at the LHC [2][3][4][5][6][7]. In these, a large-radius jet (large-$R$ jet) is reconstructed and then decomposed into smaller-radius (small-$R$) subjets, whose structure enables identification of resonances whose decay products have been collimated by their parent's large momentum. Such methods are also of high interest in searches for new physics, both because a high-mass new particle decaying into resonances naturally generates high-momentum merged jets, and because the high-momentum regime is particularly sensitive to modifications of Standard Model (SM) dynamics by new physics.
A good understanding of hard, collinear parton splittings in quantum chromodynamics (QCD) is key to measurements using large-$R$ jet methods, as these splittings constitute the dominant background to boosted resonance signatures. Monte Carlo (MC) event-simulation methods include approximations such as factorization of partonic splittings in parton-shower algorithms, and slicing of emission phase-space between matrix-element and parton-shower sources. The extreme dynamics of boosted topologies include aspects of both collinear and high-$p_T$ physics, presenting a test for such factorizations.
The role of heavy-quark masses is also important: conflicting treatments of mass effects in calculations for parton-shower MC generators and parton distribution functions (PDFs) mean that there is as yet no unambiguously more-correct simulation strategy for heavy-flavor production [8]. Additionally, the modeling of gluon splitting into heavy quarks may benefit from renormalization-scale choices different from those developed for the more common cases of gluon emission from light quarks and gluons.
Due to these factors, empirical comparisons between MC models and collider measurements of boosted $b\bar{b}$ production are key to understanding and improving the validity of parton-shower MC simulations in this event topology, and in closely related ones such as high-$p_T$ associated Higgs-boson production. This phase-space is also relatively novel for light-jet production, whose modeling can be similarly informed by equivalent measurements without the $b$-tagging requirement. This analysis follows the $VH(b\bar{b})$ event-selection strategy of requiring boosted-jet production in association with a leptonically decaying vector boson, which is effective at reducing QCD backgrounds and provides an additional, experimentally clean proxy for each event's characteristic momentum-transfer scale [1]. Previous measurements of the $Z+b(\bar{b})$ process, in a resolved rather than boosted event topology, were made using $\sqrt{s} = 7$ TeV proton-proton collisions by ATLAS, CMS and LHCb [9-13], at 8 TeV by CMS [14], and at 13 TeV by ATLAS and CMS [15,16].
Additionally, studies of correlations between $Z$ bosons and $B$ mesons have been performed by CMS at 7 TeV [17], and various properties of gluon splitting at 13 TeV have been measured by ATLAS [18]. In these papers the differential cross-section as a function of $\Delta R(b, b)$, the angular separation between the $b$-jets or $b$-hadrons, was measured and found to be mismodeled by MC generators in the small-$\Delta R(b, b)$ region typical of gluon splitting into $b\bar{b}$. Given the importance of $\Delta R(b, b)$ modeling to techniques for reconstruction of boosted Higgs bosons, this variable is also a target of this analysis. The higher 13 TeV center-of-mass energy produces a larger boosted-event population than in the 7 TeV studies, and in contrast to the recent ATLAS $Z+b$-jets cross-section measurement at 13 TeV, use is also made of small-radius charged-particle jets that allow precise measurement of small angular separations.
In this paper, we present measurements of cross-sections for the production of a leptonically decaying $Z$ boson in association with a large-$R$ jet, using data taken by the ATLAS detector at the LHC [19,20] in 2015-2016. The measured cross-sections are differential in kinematic variables of the large-$R$ jet. An additional phase-space is defined by the requirement that the large-$R$ jet be doubly $b$-tagged: total and differential cross-sections, including as a function of the angular separation of the $b$-tagged subjets, are also measured in this phase-space. These measurements provide an important test of perturbative QCD in the boosted regime, including contributions both from configurations where a high-energy gluon splits to give a $b\bar{b}$ pair carrying a significant fraction of the jet momentum, and from secondary processes in which two $b$-tagged momentum flows exist within the jet but are not its dominant kinematic components.
The relevant perturbative-QCD issues are summarized in Section 2, the ATLAS detector is described in Section 3, and the MC event samples used in the analysis and for comparison with the resulting measurements are discussed in Section 4. The physics-object definitions at the reconstruction and particle levels are described in Section 5, followed by the event selection and observable definitions in Section 6. The correction for detector biases is treated in Section 7, and the sources and estimation of systematic uncertainties are described in Section 8. Finally, the detector-corrected observables are compared with current MC predictions in Section 9.

Theory context
Heavy-flavor partons in the initial state of a hard-scattering process are understood to arise mainly from perturbative gluon splittings into $b\bar{b}$ and $c\bar{c}$ quark-antiquark pairs, formalized as DGLAP QCD evolution [21][22][23]. But in the usual factorization picture of perturbative QCD calculations there is an ambiguity in this evolution, as to whether the emergence of heavy flavor is to be isolated into the partonic cross-section $\hat{\sigma}$, or is also permitted in the evolution of the PDFs which encode the initial-state proton structure. At present, this separation is strongly tied to the treatment of the heavy quark as having a finite or zero mass [8].
The picture without heavy-quark production in the PDF evolution (here, the absence of $b$-quarks) is termed the four-flavor number scheme (4FNS). In this scheme, the $b$-quark density in the PDF is set to zero, and so the perturbative generation of initial-state $b$-quarks comes from explicit gluon splitting into a $b\bar{b}$ pair in the partonic matrix element [24], usually including $b$-quark mass effects. A consequence is that in the four-flavor scheme there are always at least two participating $b$-quarks, although they may fall outside the experimental acceptance. By contrast, in the five-flavor number scheme (5FNS) the PDF evolution can generate initial-state $b$-quarks, again through gluon splitting, but now internalized in the functional form of the $b$-quark PDF. This allows matrix-element amplitudes in which only one $b$-quark participates in the hard scatter [24]. While the 5FNS initially seems the more complete treatment, its treatment of the initial-state $b$-quark is purely longitudinal (whereas gluon splitting in the matrix element generates transverse momentum), and to avoid non-cancellation of higher-order soft divergences the initial-state $b$-quark is currently treated as massless in standard PDF approaches [25].
In a hypothetical all-orders calculation the two schemes would give the same results, but for a truncated perturbation expansion they generally give different predictions. There are arguments in favour of both approaches: the 4FNS allows for transverse-momentum exchange through the initial-state heavy quarks and hence might be expected to describe event kinematics better, while the 5FNS is able to make use of higher-order DGLAP resummation in the PDF evolution, which is not present in matrix elements matched to parton showers. Mass effects are expected to become less important for process scales $Q \gg m_b$, suggesting that higher-accuracy predictions may be obtained using the 5FNS in boosted event configurations [8][26][27][28]. Recent developments, such as the computation of $Z+b$ production at $\mathcal{O}(\alpha_s^3)$ [29] and the NLO "fusing" scheme developed by Sherpa [30], combine desirable aspects of both schemes, and should also be dominated by the 5FNS in boosted phase-space. It is therefore important to compare experimental measurements of $b$-quark production with predictions using both schemes, to empirically test the accuracy and predictivity of the available theoretical approaches.
Theoretical uncertainties also arise in the production of $b$-quarks in the partonic final state. The usual parton-shower formulation for parton or dipole splitting is derived in the collinear-emission limit, using the $p_T$ of the splitting as the characteristic (renormalization) scale, but this heuristic scale choice is only well-motivated for gluon-emission splitting functions [31]. The scale choice for gluon splitting, especially into heavy quarks, is hence an extrapolation requiring empirical testing. This is of great importance since uncertainties in heavy-flavor production by gluon splitting are a leading systematic limitation on the sensitivity of searches for Higgs boson decays into $b\bar{b}$ in the $t\bar{t}H$, $VH$ (where $V$ is either a $W$ or $Z$ boson) and gluon-fusion channels [32][33][34], particularly in boosted-Higgs configurations where the two $b$-quarks are relatively collinear, similar to the gluon-splitting kinematics [35].

ATLAS detector
The ATLAS detector [19] is a multipurpose particle detector with a forward/backward-symmetric cylindrical geometry. The detector has nearly $4\pi$ coverage in solid angle and consists of an inner tracking detector, electromagnetic and hadronic calorimeters, and a muon spectrometer.
The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range $|\eta| < 2.5$. The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer (IBL) installed before Run 2 [36,37]. It is followed by the silicon microstrip tracker (SCT), which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to $|\eta| = 2.0$. The TRT also provides electron-identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range $|\eta| < 4.9$. Within the region $|\eta| < 3.2$, electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering $|\eta| < 1.8$ to correct for energy loss in material upstream of the calorimeters. Hadron calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within $|\eta| < 1.7$, and two copper/LAr hadron endcap calorimeters. The solid-angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements respectively.
The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroidal magnets. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. Three layers of precision chambers, each consisting of layers of monitored drift tubes, cover the region $|\eta| < 2.7$, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range $|\eta| < 2.4$ with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.

ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the center of the LHC ring, and the $y$-axis points upward. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln \tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
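The coordinate conventions above can be illustrated with a short Python sketch (illustrative only, not ATLAS software), computing the pseudorapidity from the polar angle and the angular distance $\Delta R$ with azimuthal wrap-around:

```python
import math

def pseudorapidity(theta: float) -> float:
    """Pseudorapidity eta = -ln tan(theta/2), with theta the polar angle."""
    return -math.log(math.tan(theta / 2.0))

def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance Delta R = sqrt((d_eta)^2 + (d_phi)^2),
    with d_phi wrapped into (-pi, pi]."""
    deta = eta1 - eta2
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)
```

A particle at $\theta = \pi/2$ (perpendicular to the beam) has $\eta = 0$, and the azimuthal wrapping ensures that two objects on either side of the $\phi = 0$ boundary are recognized as nearby.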
A two-level trigger system is used to select events for further analysis [38,39]. The first-level trigger is implemented in hardware and utilizes partial detector information to accept events at a rate of 100 kHz. The high-level trigger is based on software and reduces the rate of accepted events to 1 kHz. An extensive software suite [40] is used in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data-acquisition systems of the experiment.

Data and MC event samples
The data used in this measurement were collected during the LHC 2015 and 2016 $pp$-collision runs at $\sqrt{s} = 13$ TeV, corresponding to an integrated luminosity of 36.1 fb$^{-1}$. The uncertainty in the combined 2015-2016 integrated luminosity is 2.1% [41], obtained using the LUCID-2 detector [42] for the primary luminosity measurements. Analyzed events were required to have all ATLAS subdetectors fully operational, and stable beam conditions. MC-simulated event samples were used in this analysis to estimate the contamination from background processes, to correct the data from reconstruction level to particle level (unfolding), and for comparisons to the unfolded data. Four processes are considered in detail for this analysis: production of $Z$+jets, $W$+jets, $t\bar{t}$, and electroweak diboson events. Single top-quark production was shown to make a negligible contribution, as its acceptance is similar to the already small $W$+jets contribution, and its cross-section is orders of magnitude smaller. QCD multijet events were shown to be negligible using a data-driven method.
The $Z$+jets signal and $W$+jets background were simulated at next-to-leading-order (NLO) accuracy with the Sherpa 2.2.1 [43,44] MC generator, matching additional hard parton emissions [45] to the parton-shower algorithm based on the Catani-Seymour dipole formalism [46]. The MEPS@NLO prescription [47][48][49][50] was used with a merging threshold of 20 GeV to provide merged matrix-element and parton-shower calculations accurate at NLO in QCD for up to two additional partons and accurate at LO for up to four additional partons; the virtual QCD matrix-element components at NLO accuracy were provided by the OpenLoops library [51,52]. In this configuration, $g \to b\bar{b}$ splittings can originate either from the matrix element or from the parton shower, depending on the transverse scale of the splitting relative to the CKKW merging scale [48,53]. As the matrix-element portions of this calculation include $g \to b\bar{b}$ splittings, it is expected that the majority of boosted jets in this sample will have been initiated by matrix-element rather than parton-shower splittings, but that the $b\bar{b}$ splitting within them can arise from either matrix-element or parton-shower modeling. The 5FNS NNPDF3.0nnlo PDF set [54] with $\alpha_s(m_Z) = 0.118$ was used, in conjunction with the Sherpa authors' standard set of tuned MC parameter values, referred to as the "tune". The samples were normalized to the NNLO inclusive $Z/W$ cross-sections [55].
An alternative LO QCD $Z$+jets sample was simulated using MadGraph 2.2.2 [56] with up to four additional partons at matrix-element level, and using the NNPDF2.3lo set of PDFs [57]. This was interfaced with Pythia 8.186 [58] for modeling of the parton shower and underlying event, with use of the CKKW-L merging procedure [59,60], and with bottom- and charm-hadron decays corrected by EvtGen 1.2.0 [61]. The A14 tune [62] and the 5FNS NNPDF2.3lo PDF set [57] with $\alpha_s = 0.13$ were used by Pythia 8. This sample was also normalized to the NNLO inclusive cross-section for use in MC-based background estimation.
The $t\bar{t}$ background was simulated using the Powheg-Box v2 HVQ [63][64][65][66] generator at NLO with the CT10 PDF [67], and matched to the Pythia 8.186 [58] parton shower and hadronization with the A14 tune [62]. The top-quark mass was set to 172.5 GeV, and the $h_{\mathrm{damp}}$ parameter, which controls the $p_T$ of the first additional emission beyond the Born configuration, was set to the mass of the top quark. This sample was normalized to the $t\bar{t}$ NNLO+NNLL cross-section [68].
The diboson processes ($WW$, $WZ$, and $ZZ$, with one of the bosons decaying hadronically and the other leptonically) were simulated using Sherpa 2.1.1 with a MEPS@NLO configuration similar to that used for the $Z$+jets and $W$+jets processes described above. The CT10nlo PDF set [69] with $\alpha_s = 0.118$ was used, with the corresponding Sherpa parton-shower tune.
All these MC event samples were processed through the ATLAS Geant4-based detector simulation [70,71] and digitization system to produce inputs to object reconstruction equivalent to those from the detector data-stream in collision events. Pile-up (multiple $pp$ collisions in each hard-interaction bunch-crossing, as well as detector-response effects due to surrounding bunch-crossings) was emulated by pre-digitization overlay of simulated detector hits from multiple Pythia 8.186 inclusive QCD events using the A3 tune [72]. The composite events were reweighted in the analysis so the distribution of the number of overlays per simulated signal event matched the mean number of collisions per bunch-crossing, $\langle\mu\rangle$, in data.
In addition to these MC generator versions and configurations, signal samples were also produced at particle level only, using the newer Sherpa 2.2.10 generator in a configuration equivalent to that described above, as well as in 4FNS and fusing variations, and using the NLO MadGraph5_aMC@NLO 2.7.3 + Pythia 8.244 generator with FxFx merging in 5FNS and 4FNS modes. These variations are used in Section 9 for comparisons with unfolded observables in data.

Lepton and jet definitions
The objects used in this analysis to select events and define observables are charged leptons, large-$R$ jets, and (optionally $b$-tagged) small-$R$ track-jets. These are defined in this section, with detailed discussion of the systematic uncertainties associated with their reconstruction postponed to Section 8.
The final results of this analysis consist of observables measured in particle-level fiducial volumes, which closely match the reconstruction-level event and object selections, to minimize model-dependent extrapolations. In what follows, the physics objects are defined at both reconstruction level and particle level. Throughout this analysis, stable particles are defined to be those with a mean lifetime $\tau > 10~\mathrm{mm}/c$.

Charged leptons:
The leptonically decaying $Z$ boson is identified by use of high-$p_T$ charged $e^\pm$ and $\mu^\pm$ pairs, including those from $\tau$-lepton decays.
At reconstruction level, identified electrons and muons were used, with contamination suppressed by use of "tight" and "medium" identification criteria for electrons and muons respectively [73,74]. The lepton candidates were geometrically restricted to the active regions of the calorimeters and muon spectrometer ($|\eta| < 2.47$ excluding the region $1.37 < |\eta| < 1.52$ for electrons, and $|\eta| < 2.47$ for muons), and were required to have $p_T > 27$ GeV for both lepton flavors. Both the electrons and muons were required to be isolated from significant energy deposits in the calorimeter and from high-momentum tracks.
Corrections derived in $Z \to \ell\ell$ events were applied to account for differences in reconstruction and identification efficiencies between data and simulated events. Electron energies as measured by the electromagnetic calorimeter were calibrated to the electron energy in simulation, and discrepancies between data and simulation were corrected [75]. The reconstructed muon momenta were similarly calibrated using a mix of simulation-based and data-driven methods [76]. Uncertainties associated with the lepton efficiencies, energy scales, and resolutions were propagated via systematic variations.
At particle level, the charged leptons are defined as final-state electrons and muons dressed with direct photons within a surrounding cone of size $\Delta R = 0.1$, with kinematic requirements of $|\eta| < 2.47$ and $p_T > 27$ GeV on the resulting objects. No explicit requirement of final-state isolation nor of direct connection to the hard scattering is made in the fiducial lepton definition, hence events with two high-$p_T$ electrons or two muons in the final-state acceptance are treated as part of the signal even if one or both arise from $\tau$-lepton or heavy-flavor hadron decays.

Large-$R$ jets:
A large-$R$ jet is required in this analysis as a proxy for a high-momentum, hadronically decaying or splitting object, e.g. a high-energy gluon.
In reconstructed events, large-$R$ jets were reconstructed from calibrated topological clusters of calorimeter cells [77], using the anti-$k_t$ algorithm with radius parameter $R = 1.0$ [78,79]. The clustered jet's energy and pseudorapidity were further calibrated using simulated data, and its mass was calibrated using a combination of calorimeter and tracking information [80,81]. Pile-up and underlying-event contributions to the jets were suppressed by a dynamic trimming [82] procedure discarding clusters from $R = 0.2$ subjets with less than 5% of the original jet $p_T$. The trimmed jets were required to have $p_T > 200$ GeV and $|\eta| < 2$, to ensure that the majority of the jet lies within the tracker volume. Discrepancies between data and simulation in the jet calibration were treated as systematic uncertainties.
At particle level, all final-state particles are used as inputs to the anti-$k_t$ $R = 1.0$ jet algorithm, and trimming is applied with the same parameters as in reconstruction, effectively subtracting underlying-event contributions. Again the trimmed jets are required to have $p_T > 200$ GeV and $|\eta| < 2$.
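The trimming criterion can be sketched as follows. This is a toy illustration of the 5% subjet-$p_T$ cut only; the actual procedure reclusters the jet constituents into $R = 0.2$ subjets with dedicated jet-finding software, and the `trim_jet` helper and its inputs here are hypothetical:

```python
def trim_jet(subjet_pts, f_cut=0.05):
    """Toy trimming: keep subjets carrying at least f_cut of the original
    (untrimmed) jet pT; return the kept subjet pTs and the trimmed jet pT.
    Assumes `subjet_pts` (in GeV) sum to the original jet pT."""
    jet_pt = sum(subjet_pts)
    kept = [pt for pt in subjet_pts if pt >= f_cut * jet_pt]
    return kept, sum(kept)
```

For example, a 220 GeV jet with subjets of 150, 60, 8, and 2 GeV has a 11 GeV threshold, so only the two hard subjets survive and the trimmed jet carries 210 GeV.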
Subjets and $b$-tagging: Small-$R$ subjets within the large-$R$ jet are used as proxies for the leading partons in the jet, e.g. the $b$-quarks in a high-energy $g \to b\bar{b}$ splitting. To achieve high angular resolution, inner-detector tracks are used in place of calorimeter information to construct these subjets.
At reconstruction level, the anti-$k_t$ algorithm with radius parameter $R = 0.2$ was used to construct track-jets from at least two inner-detector tracks matched to the primary vertex (see Section 6). The track-jets were required to have $p_T > 10$ GeV and $|\eta| < 2.5$. Identification of the track-jets as likely (or not) to have been initiated by $b$-quarks was provided by the ATLAS MV2c10 multivariate $b$-tagging algorithm [83], trained on leptonic $t\bar{t}$ events to achieve 70% $b$-tagging efficiency, with mis-tag rejection factors of 7.1 and 120 for charm and light jets respectively. Corrections were applied to compensate for differences in $b$-tagging efficiency and charm- and light-jet mis-tag rates observed between simulation and collision data. Systematic uncertainties from tracking, vertexing, and $b$-tagging calibration were evaluated.
At particle level, stable charged particles are used as the inputs to the $R = 0.2$ anti-$k_t$ jet algorithm, again with $p_T > 10$ GeV and $|\eta| < 2.5$. A charged-particle jet is considered to be $b$-tagged if a weakly decaying $b$-hadron with $p_T > 5$ GeV is associated with it by the ghost-association method [84].
Any small-$R$ track-jets or small-$R$ charged-particle jets matched to the large-$R$ jets by ghost association are considered to be charged subjets of the large-$R$ jet. A large-$R$ jet is considered $b$-tagged if any of its small-$R$ subjets is $b$-tagged, and the number of $b$-tagged subjets is used to define subclasses of signal events.
A simple overlap-removal procedure was used at both reconstruction and particle levels to accommodate a single particle or object leaving multiple signatures in the detector. In particular, this procedure was motivated by the possibility that leptons from the $Z$-boson decay could also be recorded as an additional large-$R$ jet. To correct for this, the angular separation between each lepton and large-$R$ jet was computed, and if any pair was within $\Delta R = 1.0$ of each other then the jet was removed. This cut has the additional effect of suppressing contributions from the dijet process with collinear $Z$-boson emission from quarks, while still admitting widely separated topologies in which the $Z$ boson and the large-$R$ jet are located in the same event hemisphere.
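The lepton-jet overlap removal amounts to a simple veto, sketched below in Python. This is a hypothetical standalone helper (not the ATLAS implementation), assuming each object is reduced to its $(\eta, \phi)$ coordinates:

```python
import math

def _delta_r(a, b):
    """Delta R between two (eta, phi) tuples, with azimuthal wrap-around."""
    deta = a[0] - b[0]
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)

def remove_overlapping_jets(leptons, jets, dr_cut=1.0):
    """Keep only large-R jets farther than dr_cut in Delta R from every lepton.
    Both inputs are lists of (eta, phi) tuples."""
    return [j for j in jets if all(_delta_r(l, j) >= dr_cut for l in leptons)]
```

A jet at $\Delta R \approx 0.7$ from a selected lepton is discarded, while one in the same hemisphere but at $\Delta R > 1.0$ survives.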

Event selection and observables
For both the data and reconstruction-level MC samples, single-electron and single-muon triggers [85][86][87] were used to select the subset of events of a priori relevance to this analysis before application of offline event-selection cuts. The kinematic requirements on the leptons defined in the previous section ensured full trigger efficiency for events in the analysis fiducial phase-space. Candidate events were required to have a primary vertex, defined as the vertex with the highest sum of track $p_T^2$, with at least two associated tracks of $p_T > 400$ MeV.
Event-selection cuts on the physics objects defined in Section 5 were applied equally to reconstruction-level events in data and MC simulation, and to particle-level MC events. To select signal events containing a leptonically decaying $Z$-boson candidate produced in association with a large-$R$ jet, events were required to have exactly two charged leptons of the same flavor, $\ell^\pm \in \{e^\pm, \mu^\pm\}$, and at least one large-$R$ jet. No opposite-charge requirement was placed on the charged-lepton pairs, but the invariant mass of the lepton pair, $m_{\ell\ell}$, was required to be greater than 50 GeV to exclude the photon-dominated part of the Drell-Yan continuum.
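The $m_{\ell\ell} > 50$ GeV requirement uses the standard dilepton invariant mass, which in the massless-lepton approximation follows from the $(p_T, \eta, \phi)$ of the two leptons. A minimal sketch (not the ATLAS implementation):

```python
import math

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Dilepton invariant mass in the massless-lepton approximation:
    m^2 = 2 * pt1 * pt2 * (cosh(d_eta) - cos(d_phi)), pT in GeV."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))
```

For example, two back-to-back 45 GeV leptons at $\eta = 0$ give $m_{\ell\ell} = 90$ GeV, comfortably passing the 50 GeV cut.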
This set of criteria defines the "inclusive" event-selection region for this analysis. A more exclusive "2-tag" region was defined as a subset of this by additionally requiring that the large-$R$ jet contain exactly two $b$-tagged subjets. The numbers of events selected in these regions, for the dielectron and dimuon channels separately, and for both collision data and MC simulation, are shown in Table 1, omitting the single-top and multijet processes, which were shown to be negligible by MC studies and a data-driven background study respectively.
The large-$R$ jet used to construct the observables below was chosen as the highest-$p_T$ large-$R$ jet in the inclusive selection, and the highest-$p_T$ 2-tag large-$R$ jet in the 2-tag selection. The 2-tag observables are hence not a subset of the inclusive observables, as the latter include events in which the highest-$p_T$ large-$R$ jet does not contain two $b$-tagged subjets.
The differential observables measured in this analysis are:

Properties of the large-$R$ jet (J): the large-$R$ jet mass, $m_J$, and transverse momentum, $p_T^J$, for both the inclusive and 2-tag regions;

Properties of the large-$R$ jet and $Z$-boson system ($Z$+J): the transverse momentum of their vector sum, $p_T^{Z+\mathrm{J}}$, and their separation in $\phi$, $\Delta\phi(Z, \mathrm{J})$, for the inclusive region only;

Subjet separation: the angular separation, $\Delta R(b, b)$, between the two $b$-tagged subjets in the 2-tag region.

Table 1: Reconstruction-level event-selection yields (and statistical uncertainties of the expected yields) in the $ee$ and $\mu\mu$ channels from each process's MC sample (with Sherpa 2.2.1 used for the $Z$+jets samples) with the normalizations discussed in Section 4, and from collision data. The $Z+q$ categories were defined using a particle-level filtering strategy in which the given orthogonal combinations of heavy hadrons (from any source) with $p_T > 5$ GeV, $|\eta| < 2.9$, and associated to a $p_T > 10$ GeV truth-particle jet, were enforced in separate MC event samples during event generation. The single-top process was found to make a negligible contribution to all event selections and has been omitted. Multijet backgrounds were estimated to be negligible by a data-driven method.
These variables respectively measure the external and internal kinematics of the selected large-$R$ jet, the effect of additional QCD radiation on the $Z$+J event topology, and the kinematics of $g \to b\bar{b}$ splitting in the boosted regime. The total cross-sections in the fiducial volume for the inclusive and 2-tag event selections were also measured, via integration of the differential measurements.
A selection of these observables is shown at reconstruction level in Figure 1, illustrating the inclusive large-$R$ jet and $Z$+J $p_T$ distributions, and the 2-tag large-$R$ jet mass and $\Delta R(b, b)$ distributions in data and simulation. One $ee$ and one $\mu\mu$ dilepton observable are shown for each event-selection region: all observables were constructed separately for the electron and muon channels, to allow consistency checks and consideration of distinct detector effects before lepton-channel combination. Significant discrepancies between data and the sum of MC-modeled processes (including correction factors) are visible in several variables, particularly the inclusive large-$R$ jet and $Z$+J-system transverse momenta. This evidence of general mismodeling in the boosted phase-space at reconstruction level motivated the publication of the detector-corrected forms of these observables.
These plots show the admixture of processes contributing to the two event-selection regions, with the inclusive event selection dominated by the $Z$+light-parton (including $Z$+charm-quark) and $t\bar{t}$ production processes, and the 2-tag region dominated by $Z+b\bar{b}$, with significant background contributions from diboson and $t\bar{t}$ production.

Correction of observables to particle level
The main result of this analysis is the set of differential cross-sections introduced in the previous section, corrected to the particle-level fiducial phase-space by unfolding: the deconvolution of biases introduced by the detector and reconstruction algorithms. Presentation in this fiducial form assists comparison with results from other experiments and with theoretical predictions.
In this analysis, the unfolding was performed using the Fully Bayesian Unfolding (FBU) technique [88]. FBU directly performs a likelihood fit in the parameter space of signal cross-sections $\sigma$, plus a set of nuisance parameters $\theta$ that control background compositions and systematic uncertainties (to be described in Section 8). The FBU method hence gives access to a full posterior probability density in the space of signal and nuisance parameters, from which arbitrarily detailed correlation information may be extracted.
The FBU posterior probability for each observable is constructed as the product of Poisson probabilities over all reconstruction-level bins $i$, as a function of the model parameters $\phi = \{\sigma, \theta\}$:
$$P(\phi \mid D) \propto \mathcal{L}(D \mid \phi)\, \pi(\phi),$$
where $D$ is the set of observed bin counts in data, and $\pi(\phi)$ is a set of prior probability densities over the model parameters. The $\mathcal{L}$ term in this can be expressed as a product of Poisson likelihoods over the bins,
$$\mathcal{L}(D \mid \phi) = \prod_i \mathrm{Pois}\!\left(D_i \mid \nu_i(\phi)\right),$$
where $\nu(\phi)$ is the set of expected total bin yields. This can be decomposed further into reconstruction-level background and signal contributions,
$$\nu_i = b_i + L_{\mathrm{int}} \sum_j M^{\mathrm{P}\to\mathrm{R}}_{ij}\, \sigma_j,$$
where $b_i$ are the background yields, $L_{\mathrm{int}}$ is the integrated luminosity of the dataset, and $M^{\mathrm{P}\to\mathrm{R}}_{ij}$ is the response matrix $P(i \mid j)$ mapping particle-level bins $\{j\}$ to reconstruction-level bins $\{i\}$. Overflows outside the fiducial acceptance are included in the bin indexing, so migrations into and out of the acceptance are treated with the full machinery. This provides the full formalism necessary to relate the observed data $D_i$ to our parameters of interest, the particle-level signal cross-sections $\sigma_j$. In the inclusive region, the signal is defined as all $Z$+jets contributions, while in the 2-tag region it is only $Z+b\bar{b}$, with other $Z$+jets flavors now considered part of the background.
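The Poisson likelihood described above can be sketched numerically. The following Python function is an illustrative toy (not the hunfold implementation), evaluating the log-likelihood for given background yields, response matrix, signal cross-sections, and integrated luminosity:

```python
import math

def log_likelihood(data, bkg, response, sigma, lumi):
    """Poisson log-likelihood: the expected yield in reconstruction-level
    bin i is nu_i = bkg[i] + lumi * sum_j response[i][j] * sigma[j], where
    response[i][j] maps particle-level bin j to reconstruction-level bin i."""
    logL = 0.0
    for i, n in enumerate(data):
        nu = bkg[i] + lumi * sum(r * s for r, s in zip(response[i], sigma))
        logL += n * math.log(nu) - nu - math.lgamma(n + 1.0)
    return logL
```

In FBU this likelihood, multiplied by the priors, defines the posterior that is sampled over the signal cross-sections and nuisance parameters.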
The background cross-sections and response matrix were constructed from a set of MC-derived histogram templates, including predictions from the nominal MC samples (Sherpa $Z$+jets) with data/MC corrections, and a set of predictions from each systematic-uncertainty variation to be described in Section 8. Examples of nominal-model response-matrix templates for two observables are shown in Figure 2.
The unit nuisance parameters $\theta_i \in \boldsymbol{\theta}$ were used to define linear interpolations of $\boldsymbol{\sigma}^{\mathrm{bkg}}$ and $M^{\mathrm{P \to R}}$ between templates corresponding to $\theta_i = 0$ and $1$. The absolute deviations obtained from "up" and "down" variations of nuisance parameters were averaged into single positive deviations for use in this symmetrized form. Unit Gaussian priors were applied to all $\theta_i$ other than the luminosity uncertainty; as negative luminosities would imply unphysical negative event rates, the luminosity uncertainty was modelled by an always-positive log-normal prior with $\mu = 0$ and $\sigma = 0.021$. The background normalizations were allowed to float with Gaussian prior widths discussed in Section 8, and flat, non-negative priors were imposed on the signal cross-section parameters $\boldsymbol{\sigma}$.
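A minimal sketch of the symmetrization and linear template interpolation described above; the function names and toy templates are invented for illustration, while the log-normal luminosity prior uses the quoted $\sigma = 0.021$:

```python
import numpy as np
from scipy.stats import lognorm

def symmetrize(nominal, up, down):
    """Average the absolute 'up'/'down' deviations into a single
    positive deviation template, as done for the FBU inputs."""
    return nominal + 0.5 * (np.abs(up - nominal) + np.abs(down - nominal))

def interpolate_template(nominal, varied, theta):
    """Linear interpolation between the nominal template (theta = 0) and
    the symmetrized one-sigma variation (theta = 1)."""
    return nominal + theta * (varied - nominal)

# Toy background template with an up/down systematic variation
nominal = np.array([100.0, 50.0, 20.0])
up      = np.array([110.0, 52.0, 21.0])
down    = np.array([ 92.0, 49.0, 19.5])

var = symmetrize(nominal, up, down)
print(interpolate_template(nominal, var, 0.5))

# Always-positive log-normal prior for the luminosity nuisance parameter
lumi_prior = lognorm(s=0.021)  # mu = 0, sigma = 0.021 in log-space
```

The same interpolation is applied entry-by-entry to the response-matrix templates.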
For this analysis, the "hunfold" [89] implementation of FBU was used. This uses gradient ascent to maximize the posterior log-probability $\ln P(\boldsymbol{\phi} \mid \boldsymbol{D})$, and then samples the posterior probability distribution using a proposal density in $\boldsymbol{\phi}$ derived from the likelihood Hessian matrix at the maximum-likelihood point. In the unfolded observables of this analysis (see Section 9), each variable is constructed from a sum of electron and muon channels via a single FBU fit, with a "double width" concatenation of electron and muon response matrices used to simultaneously unfold the electron and muon distributions into combined particle-level distributions representing $Z$-boson decays into either lepton flavor.
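The maximize-then-sample strategy can be demonstrated on a toy posterior. This is a sketch of the same idea (a Laplace-style proposal built from the Hessian at the mode), not the hunfold implementation itself:

```python
import numpy as np
from scipy.optimize import minimize

# Toy log-posterior: a correlated 2-D Gaussian standing in for the FBU posterior
cov_true = np.array([[1.0, 0.6], [0.6, 2.0]])
prec = np.linalg.inv(cov_true)

def log_post(x):
    return -0.5 * x @ prec @ x

# Step 1: maximize the log-posterior (hunfold uses gradient ascent)
res = minimize(lambda x: -log_post(x), x0=np.ones(2))

# Step 2: estimate the Hessian of -ln P at the maximum by finite differences
def hessian(f, x, eps=1e-4):
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return H

H = hessian(lambda x: -log_post(x), res.x)

# Step 3: use N(x_map, H^-1) as the proposal density for posterior sampling
rng = np.random.default_rng(1)
samples = rng.multivariate_normal(res.x, np.linalg.inv(H), size=20000)
print(np.cov(samples.T))
```

For this Gaussian toy the proposal is exact; for the real, non-Gaussian FBU posterior it serves only as an efficient starting proposal for the sampler.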

Systematic uncertainties
The measurements in this analysis are affected by statistical uncertainties and by systematic uncertainties from detector-interaction and reconstruction processes, from MC modeling, and from the unfolding procedure. Estimates of these uncertainties were derived using standard methods described in this section, and were propagated through the unfolding procedure where they affected the final posterior distributions.
The main sources of experimental systematic uncertainty affecting these measurements were:

Charged leptons: energy/momentum scale and resolution, and reconstruction, identification, isolation, and trigger efficiencies. Systematic variations of the data/MC efficiency corrections and energy/momentum calibrations applied to the MC samples [75,90] were used to define variations from their nominal templates in the FBU unfolding's parameterized background cross-sections and signal response matrices.
Large-$R$ jets: energy scale, mass scale, energy resolution, and mass resolution. The jet energy scale (JES) and jet mass scale (JMS) uncertainties were based on the double-ratio, between data and simulation, of each variable (energy, mass) to its equivalent reconstructed from track-jets; this construction permitted separation of the physics effects from the calorimeter-reconstruction systematic uncertainties [91].
In this analysis, the JES and JMS uncertainties have been treated as fully uncorrelated; cross-checks assuming higher degrees of correlation had negligible effect. The uncertainty in the jet mass resolution (JMR) was determined by smearing the jet mass such that its resolution was degraded by 20%, and for the large-$R$ jet energy resolution (JER), symmetric variations of the jet energies by ±2% were applied.
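The JMR variation amounts to adding Gaussian smearing in quadrature so that the resolution grows by 20%. The sketch below illustrates that construction with invented numbers; it is not the ATLAS calibration code:

```python
import numpy as np

def smear_mass(masses, sigma_nominal, degrade=0.20, rng=None):
    """Degrade the jet-mass resolution by `degrade` (e.g. 20%) by adding, in
    quadrature, the extra Gaussian width needed to take the resolution from
    sigma to (1 + degrade) * sigma."""
    rng = rng if rng is not None else np.random.default_rng(0)
    sigma_extra = sigma_nominal * np.sqrt((1.0 + degrade) ** 2 - 1.0)
    return masses + rng.normal(0.0, sigma_extra, size=len(masses))

# Toy reconstructed jet masses with an assumed 8 GeV nominal resolution
rng = np.random.default_rng(42)
true_mass, sigma = 90.0, 8.0
reco = true_mass + rng.normal(0.0, sigma, size=100_000)
smeared = smear_mass(reco, sigma, rng=rng)
print(np.std(smeared) / np.std(reco))  # close to 1.2
```

The smeared sample is then passed through the full template machinery to form the JMR variation template.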
Flavor tagging: track-jet $b$-tagging efficiencies, and mis-tag rates for $c$-jets and light-flavor jets. The $b$-tagging efficiency and charm mis-tag rate in simulation were calibrated using the tag-and-probe method in $t\bar{t}$ events [92,93], and the light-jet mis-tag rate was calibrated in dijet events [94]. A total of 25 diagonalized systematic uncertainties associated with these calibration factors were considered in this analysis.
Pile-up: pile-up reweighting uncertainty. The MC predictions were reweighted such that their distribution of the number of pile-up vertices matched the pile-up distribution measured in the data. The uncertainty in this procedure was propagated to the unfolding via variations of the pile-up weights.
In addition to the above measures of imperfect understanding of the detector, uncertainties in the modeling of both the signal and background physics processes were propagated to the final measurement, via systematic variations in the unfolding components. As for the detector uncertainties, this propagation was implemented via linear interpolation of template distributions (including response matrices) in the unfolding machinery. For the signal process, the following were included: standard 7-point variations of the renormalization and factorization scales by factors of two from the nominal values; the nominal PDF's error set; differences between the nominal PDF and the alternative CT14nnlo [95] and MMHT2014nnlo [96] central PDFs; and variations of $\alpha_{\mathrm{s}}(m_Z)$ by ±0.001. Approximate uncertainties in the matching procedure between matrix element and parton shower were evaluated by variations between the nominal Sherpa samples and the MadGraph5_aMC@NLO + Pythia 8 samples. The envelope of differences in the signal-process response matrix between Sherpa and MadGraph was also treated as a systematic uncertainty; a cross-check showed that the results were not sensitive to whether or not the Sherpa/MadGraph difference was split into several independently parametrized components.
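For reference, the 7-point scale variation yields an envelope-style uncertainty band. The sketch below shows the usual envelope construction with invented toy predictions, not the generators' internal machinery:

```python
import numpy as np

# Standard 7-point variation: (muR, muF) factors, excluding the
# anticorrelated (2, 1/2) and (1/2, 2) combinations
seven_point = [(1, 1), (2, 1), (0.5, 1), (1, 2), (1, 0.5), (2, 2), (0.5, 0.5)]

def scale_envelope(predictions):
    """Uncertainty band from the envelope of the 7-point scale variations.

    predictions[k] is the cross-section array for the k-th (muR, muF)
    choice, with k = 0 the nominal; returns (down, up) deviations."""
    preds = np.asarray(predictions)
    nominal = preds[0]
    return nominal - preds.min(axis=0), preds.max(axis=0) - nominal

# Toy per-bin predictions for each of the 7 scale choices
preds = [np.array([2.53, 1.10]), np.array([2.90, 1.20]), np.array([2.10, 1.00]),
         np.array([2.70, 1.15]), np.array([2.40, 1.05]), np.array([3.10, 1.30]),
         np.array([2.00, 0.95])]
down, up = scale_envelope(preds)
print(down, up)
```

The resulting asymmetric band is what dominates the NLO prediction uncertainties quoted in Section 9.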
Dedicated variation MC samples were used to evaluate the modeling of the $t\bar{t}$ background, considering the matrix element, the parton-shower model, and the dependence on initial- and final-state radiation settings and the $h_{\mathrm{damp}}$ parameter. In addition, the quality of $t\bar{t}$ modeling was assessed in MC-data comparisons using an opposite-flavor $e\mu$ variation on the standard $ee/\mu\mu$ selection; additional mismodeling uncertainties of 30-50% were added to the first bins of the inclusive $p_{\mathrm{T}}^{Z+\mathrm{J}}$ observable, to cover an MC-data discrepancy in this control region. A conservative $\sigma = 0.2$ Gaussian prior, informed by the maximum reconstruction-level data/MC disagreement, was used for the normalization uncertainty of all background processes other than the non-signal $Z$+jets samples in the 2-tag region, for which a larger $\sigma = 0.5$ prior was used. This inflated uncertainty was assigned to reflect that a robust in situ measurement of the 2-tag flavor fractions could not be obtained from $b$-tagging variable templates, due to the low event count in this analysis phase-space.
In addition, systematic uncertainties arise from the finite sizes of the simulation samples. In principle, the statistical limitation of each bin corresponds to a nuisance parameter in the unfolding. In practice, however, many of these parameters have little effect, since most bins with low rates, e.g. very statistically limited off-diagonal elements of the response matrix, by construction do not contribute much to the result. This abundance of uncertain quantities creates an intractably large space of nuisance parameters in which the unfolding fit is unlikely to converge. A "pruning" procedure was hence implemented, both for statistical uncertainties in response-matrix estimates and for all the detector and modeling systematic uncertainties described above. The pruning criterion was to remove nuisance parameters which produced a background variation of less than 5% in all bins (the background fractions being around 20% of the event yield in the inclusive region and 40% in the 2-tag region), and which did not change any entry in the response matrix by at least 0.002. The effect was to prune jet-substructure and most lepton-calibration systematic uncertainties for all observables, jet-mass systematic uncertainties for the $p_{\mathrm{T}}^{\mathrm{J}}$ and $p_{\mathrm{T}}^{Z+\mathrm{J}}$ observables, and jet-mass resolution for all but the large-$R$ jet mass observable. The $b$-tagging uncertainties were partially pruned for the 2-tag region; naturally, there are no such uncertainties in the inclusive region. The unpruned MC statistical uncertainties in total contributed a subleading ∼1%.
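The pruning criterion can be sketched directly; the template layout and names below are illustrative assumptions, while the 5% and 0.002 thresholds are the quoted analysis values:

```python
import numpy as np

def prune(nuisances, bkg_nominal, response_nominal,
          bkg_threshold=0.05, response_threshold=0.002):
    """Keep only nuisance parameters whose varied templates shift some
    background bin by >= 5% or some response-matrix entry by >= 0.002."""
    kept = []
    for name, (bkg_var, resp_var) in nuisances.items():
        bkg_shift = np.abs(bkg_var / bkg_nominal - 1.0)
        resp_shift = np.abs(resp_var - response_nominal)
        if bkg_shift.max() >= bkg_threshold or resp_shift.max() >= response_threshold:
            kept.append(name)
    return kept

# Toy nominal templates and two candidate nuisance parameters
bkg = np.array([20.0, 10.0])
resp = np.array([[0.9, 0.1], [0.1, 0.9]])
nuisances = {
    "jes":  (bkg * 1.10, resp),          # 10% background shift: kept
    "tiny": (bkg * 1.01, resp + 0.001),  # below both thresholds: pruned
}
print(prune(nuisances, bkg, resp))  # → ['jes']
```

Pruned parameters are simply dropped from the fit's parameter space before sampling.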
A comprehensive set of closure tests of the unfolding and nuisance-profiling procedure was performed, including closure tests with and without reweighting of the MC samples to match the reconstruction-level data distribution in each variable, stress tests unfolding Sherpa pseudodata with a MadGraph-derived response matrix and vice versa, and checks against bias from non-uniform signal priors. Non-closure effects from these tests, which were very small in nearly all bins but rose to around 20% in single bins of the $p_{\mathrm{T}}^{\mathrm{J}}$ and $p_{\mathrm{T}}^{Z+\mathrm{J}}$ distributions, were added in quadrature to the final uncertainties.
Summary systematic uncertainties are listed in Table 2 and illustrated for two observables in Figure 3. The summaries were obtained from the FBU posterior-distribution samples by computing the sample covariance matrix between all fit parameters, including nuisance parameters and signal bin-values, $\mathrm{cov}_{ab} = \langle \phi_a \phi_b \rangle - \langle \phi_a \rangle \langle \phi_b \rangle$ for parameter indices $a$ and $b$. The absolute values of covariance-matrix rows for the elementary nuisance parameters were then summed in semantic groupings, e.g. the sets of nuisance parameters for electron reconstruction, jet reconstruction, $b$-tagging, etc. The resulting grouped-covariance entries $\mathrm{cov}_{is}$, where $i$ is the signal bin index and $s$ the systematic-group index, were projected onto the signal-bin cross-section parameters. By construction, these grouped uncertainties are symmetric. The total uncertainty, including statistical effects, is given by the full standard deviation of each signal bin value. The largest systematic-group effects are from large-$R$ jet calibration and signal modeling in the inclusive region, and from background normalization and $b$-tagging calibration in the 2-tag region.

Table 2: Summary table of relative uncertainty magnitudes per observable. These uncorrelated estimates of systematic uncertainties' contributions to the total uncertainty are based on projection of nuisance parameters onto the signal cross-section bin values via the likelihood-scan covariance matrix, summing the elementary contributions in quadrature.
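One way to realize this grouped-covariance projection is sketched below on toy posterior samples. The grouping indices and the normalization by the nuisance standard deviations are illustrative choices, not the exact relation used in the analysis:

```python
import numpy as np

# Toy posterior samples: columns = [signal bins s0, s1 | nuisances n0, n1, n2]
rng = np.random.default_rng(3)
n = 50_000
theta = rng.normal(size=(n, 3))
signal = np.column_stack([
    2.0 + 0.3 * theta[:, 0] + 0.1 * theta[:, 1] + 0.05 * rng.normal(size=n),
    1.0 + 0.2 * theta[:, 2] + 0.05 * rng.normal(size=n),
])
params = np.column_stack([signal, theta])

# Sample covariance over all fit parameters
cov = np.cov(params.T)

# Semantic groupings: nuisance column indices per group
groups = {"jets": [2, 3], "btag": [4]}
n_sig = 2
for name, idx in groups.items():
    # Sum of |covariance| between each signal bin and the group's nuisances,
    # normalized by the nuisance standard deviations
    contrib = np.abs(cov[:n_sig, idx] / np.sqrt(np.diag(cov)[idx])).sum(axis=1)
    print(name, contrib)
```

For the "jets" group this recovers, up to sampling noise, the injected 0.3 + 0.1 dependence of the first toy signal bin.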

Results
The posterior distributions of nuisance parameters common to the electron and muon channels were unfolded independently and found to be consistent before performing the simultaneous unfolding shown here. Other than in the inclusive-region $m^{\mathrm{J}}$ distribution, the FBU procedure did not significantly constrain most systematic-uncertainty nuisance parameters. The nuisance parameter for the Sherpa vs. MadGraph modeling uncertainty was the exception: it was constrained in favor of Sherpa to 20-40% of the original prior width by all distributions. The background normalizations changed by at most a few percent, not significantly modifying the admixture of signal and background predicted by the MC programs, and least of all in the 2-tag fits.
The full multidimensional posterior-probability distribution is the most complete form of the measurement, but for histogram presentation each bin's marginal probability distribution is used to define the central value and error bar; these correspond to the marginal median and marginal central 68% confidence range respectively.
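Extracting the marginal median and central 68% range from posterior samples can be sketched as follows; the sample array and numbers are illustrative:

```python
import numpy as np

def marginal_summary(samples):
    """Central value and asymmetric error bar for one bin from its marginal
    posterior: the median and the central 68% interval of the samples."""
    med = np.median(samples)
    lo, hi = np.percentile(samples, [16.0, 84.0])
    return med, med - lo, hi - med

# Toy marginal posterior for one cross-section bin
rng = np.random.default_rng(7)
samples = rng.normal(2.37, 0.28, size=200_000)
med, err_lo, err_hi = marginal_summary(samples)
print(f"{med:.2f} -{err_lo:.2f} +{err_hi:.2f}")
```

For a skewed marginal distribution the two error-bar halves would differ, which is why the interval is quoted rather than a single standard deviation.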
The final unfolded differential cross-sections as functions of the event kinematics are presented in Figures 4 and 5, compared with NLO particle-level predictions from Sherpa 2.2.1 and 2.2.10, and from MadGraph5_aMC@NLO 2.7.3 + Pythia 8.244. In this section, all predictions are normalized to their own calculated cross-section, to allow an unbiased comparison of both the total rates and the distribution shapes between the different generators.
The total fiducial cross-sections are measured by integration of the angular distributions (chosen because they do not have overflow bins) in both event-selection regions. The measured values are $\sigma_{\mathrm{incl}} = 2.37 \pm 0.28$ pb for the inclusive selection, and $\sigma_{\text{2-tag}} = 14.6 \pm 4.6$ fb for the 2-tag selection.
The NLO Sherpa 2.2.10 and NLO MadGraph5_aMC@NLO + Pythia 8 generators predict slightly higher central values of the inclusive cross-section than the measured one, at 2.53 ± 1.25 pb and 2.68 ± 0.67 pb respectively, while the LO MadGraph central configuration overestimates it at 2.84 pb. The older Sherpa 2.2.1 central prediction of 2.37 pb agrees closely with the measurement. The large uncertainties in the NLO calculations are dominated by the effects of scale variations; systematic uncertainties are not well defined for the LO calculation, but the better nominal performance is evidently provided by the NLO generators. For the 2-tag cross-section, the NLO 5-flavor Sherpa 2.2.10 and MadGraph5_aMC@NLO central predictions describe the data well, at 14.9 ± 4.2 fb and 14.4 ± 1.9 fb respectively. The Sherpa fusing cross-section, mixing elements of the 4- and 5-flavor calculations, is close to the 5-flavor predictions at 14.3 ± 4.8 fb. The 4-flavor Sherpa and MadGraph5_aMC@NLO calculations, and the previous 5-flavor Sherpa 2.2.1 prediction, underestimate the 2-tag cross-section at 9.4 ± 3.1 fb, 4.4 ± 1.1 fb, and 9.1 fb respectively. This result underscores the expectation that the 5-flavor (or fusing) scheme is the more appropriate choice for heavy-quark production in this analysis phase-space, even for $b\bar{b}$-pair production.
The ratio of 2-tag to inclusive cross-sections seen in data is $\sigma_{\text{2-tag}}/\sigma_{\mathrm{incl}} = (0.62 \pm 0.12)\%$, accounting for cancellations of shared systematic uncertainties between the inclusive and 2-tag cross-section estimates. This figure is reproduced well by Sherpa 2.2.10, at (0.59 ± 0.39)%, and by MadGraph5_aMC@NLO + Pythia 8, at (0.54 ± 0.21)%. The older NLO Sherpa 2.2.1 and leading-order MadGraph + Pythia 8 estimates undershoot, at 0.42% and 0.38% respectively. These cross-sections hence furnish new experimental discriminators between perturbative-QCD models of high-$p_{\mathrm{T}}$ heavy-flavor production rates, despite the significant measurement uncertainties.
In the inclusive-selection differential distributions of Figure 4, the MadGraph5_aMC@NLO + Pythia 8 predictions have the shapes in best agreement with data, not suffering from the excesses of activity common to the Sherpa models and leading-order MadGraph + Pythia 8 in the more extreme phase-space regions of high $p_{\mathrm{T}}^{\mathrm{J}}$ and $p_{\mathrm{T}}^{Z+\mathrm{J}}$, and small $\Delta\phi(Z, \mathrm{J})$. In this topology, where the $Z$+jets process becomes more like a dijet system with collinear $Z$-boson radiation, both generators display similar shape deviations with respect to the measurement, with the best agreement at low $p_{\mathrm{T}}$, low mass, and low levels of additional event activity (as characterised by low $p_{\mathrm{T}}^{Z+\mathrm{J}}$ values and the back-to-back $\Delta\phi(Z, \mathrm{J}) \sim \pi$ region). The excess in the inclusive cross-section estimate of nominal MadGraph5_aMC@NLO + Pythia 8 arises from a relatively small overpopulation with respect to Sherpa in the most populated bins of $p_{\mathrm{T}}^{\mathrm{J}}$ and $p_{\mathrm{T}}^{Z+\mathrm{J}}$, while its shapes typically match data to within 10%, whereas the other generators overestimate high-scale activity by 50-100%.
As noted in the review of reconstruction-level plots, the mismodeling of extra radiation by Sherpa and leading-order MadGraph + Pythia 8 is one of the most significant discrepancies between simulation and data observed in this analysis. Despite the large measurement uncertainties, this evidence of larger transverse recoil of the $Z$+J system in simulation than in data, as well as the higher-than-observed $p_{\mathrm{T}}$ and mass of the large-$R$ jets, is an important result for inclusive QCD model development and tuning in this boosted phase-space.
In the 2-tag selection distributions in Figure 5, the larger uncertainties and much lower event counts mean that shape discrepancies are more difficult to discern: current shape modeling appears to perform adequately, with relatively constant MC/data ratios for the large-$R$ jet mass and $\Delta R(b,b)$. In particular, the large-$R$ jet mass in this region appears to be consistently well described by all MC models, with no sign of the excesses and model disagreements seen in the inclusive-region version of that observable. Further analysis with the complete Run 2 dataset will be required to discriminate between the models in this phase-space, beyond the evident favoring of the 5FNS for the total $Z+b\bar{b}$ production rate.

Conclusion
We have presented measurements of cross-sections for the production of a leptonically decaying $Z$ boson in association with a large-radius jet in LHC 13 TeV proton-proton collision events from the ATLAS $36~\mathrm{fb}^{-1}$ combined 2015-2016 data-taking run, corrected to a particle-level fiducial region. The observables presented are differential in kinematic variables of the $Z$ boson, the large-$R$ jet, and its associated small-$R$ $b$-tagged charged-particle jets. They are measured with a flavor-inclusive event selection and also within a "2-tag" event-selection region which adds a double $b$-hadron labeling requirement on the large-$R$ jet. The integrated cross-sections within the fiducial volumes of the event-selection regions have also been presented.
These cross-section estimates were extracted from data using the Fully Bayesian Unfolding formalism, effectively performing a posterior-probability fit over a combination of signal and background cross-section parameters, and various systematic uncertainties affecting the response of the detector. These measurements provide an important test of perturbative quantum chromodynamics, with particular emphasis on the production rates and kinematics of bottom quarks. These are a significant background to several important Higgs-boson searches, and are affected by significant theory and modeling uncertainties. The full data, correlations, and samples from the posterior-probability function are provided for use in event-generator tuning and model hypothesis-testing via public databases.
The differential cross-sections indicate significant mismodeling of QCD activity in the inclusive event selection by many MC models, with both the NLO Sherpa and LO MadGraph + Pythia 8 event generators predicting greater $p_{\mathrm{T}}$ and azimuthal decorrelation in the $Z$+J system than seen in the ATLAS data. The large-$R$ jet itself is consequently biased to higher $p_{\mathrm{T}}$ and mass values than in data, although to a lesser extent than the deviations in the $Z$+J-system observables. The NLO MadGraph5_aMC@NLO + Pythia 8 model, by contrast, describes all distribution shapes well, with only a small overestimate of the inclusive fiducial cross-section. All models somewhat overestimate this cross-section, with recent Sherpa versions providing the best description.
The 2-tag selection, while its discrimination power is limited by the number of data events, does not appear to suffer from the same shape-modeling issues, and there is good shape agreement between the data and all MC models. The strongest feature observed in this event-selection region is in normalization, with models using the 4FNS approach significantly underestimating the rate of $Z+b\bar{b}$ boosted-jet production. Five-flavor approaches with modern tools do much better, with both Sherpa 2.2.10 and MadGraph5_aMC@NLO providing accurate predictions for the 2-tag cross-section and the ratio of 2-tag to inclusive rates. This information is important for future use of MC-derived large-$R$ jet flavor composition in, for example, studies of the $H \to b\bar{b}$ process. As the result is statistically limited in the 2-tag region, the significant increase in integrated luminosity from the full LHC Run 2 dataset (and expected from the LHC Run 3 program) should provide a clearer view of how far the validity of MC modeling of heavy flavor extends into this extreme event topology.

Figure 1 :
Figure 1: Selected reconstruction-level observables, compared with pre-fit MC simulation with Sherpa 2.2.1 used for the $Z$+jets samples: the top row shows the inclusive-selection large-$R$ jet $p_{\mathrm{T}}$ distribution (left) and $p_{\mathrm{T}}^{Z+\mathrm{J}}$ distribution (right), and the bottom row shows the 2-tag selection large-$R$ jet mass and $\Delta R(b,b)$ distributions. The MC statistical uncertainties are shown by the dark gray band, and the total uncertainty, including in quadrature the systematic uncertainties detailed in Section 8, is shown by the light gray band. The statistical uncertainty of the data is given by the error bar on the data points.

Figure 3 :
Figure 3: Illustration of leading post-unfolding groups of systematic uncertainties for the inclusive-selection large-$R$ jet $p_{\mathrm{T}}$ (left) and the 2-tag selection $\Delta R(b,b)$ (right). These groups have been constructed from elementary systematic nuisance parameters, assuming statistical independence of error sources within each group.

Figure 4 :
Figure 4: Particle-level differential cross-sections in the inclusive event selection. The top row shows the large-$R$ jet $p_{\mathrm{T}}$ (left) and mass (right), and the bottom row shows the $p_{\mathrm{T}}$ of the $Z$+J system (left) and the azimuthal separation of the $Z$ boson and the large-$R$ jet (right). The combined statistical and systematic uncertainty band from the FBU fit is shown. In the legend, "MGaMC" refers to NLO configurations of the MadGraph5_aMC@NLO generator, and "MG" to LO MadGraph, both run in conjunction with Pythia 8. All models use the 5FNS.

Figure 5 :
Figure 5: Particle-level differential cross-sections in the 2-tag event selection. The top row shows the large-$R$ jet $p_{\mathrm{T}}$ (left) and mass (right), and the bottom row shows the angular separation of the $b$-tagged charged-particle subjets. The combined statistical and systematic uncertainty band from the FBU fit is shown. In the legend, "MGaMC" refers to NLO configurations of the MadGraph5_aMC@NLO generator, run in conjunction with Pythia 8, and "4/5F" refer to the flavor-number scheme used.