Multi-jet merging in a variable flavor number scheme

We propose a novel technique for the combination of multi-jet merged simulations in the five-flavor scheme with calculations for the production of b-quark associated final states in the four-flavor scheme. We show the equivalence of our algorithm to the FONLL method at the fixed-order and logarithmic accuracy inherent to the matrix-element and parton-shower simulation employed in the multi-jet merging. As a first application we discuss Zbb production at the Large Hadron Collider.


I. INTRODUCTION
Measurements involving heavy-flavor (HF) production are a vital component of the physics program at the Large Hadron Collider (LHC). With the Higgs boson decaying predominantly into b quarks, some of the recent efforts in the LHC experiments have focused on this decay mode in Higgs production in association with vector bosons [1,2] and in association with top quarks [3,4]. Furthermore, searches for physics beyond the Standard Model (SM) also often rely on heavy flavor final states because the couplings to third generation fermions are sometimes assumed to be enhanced in new physics models. While the modeling of signal processes is relevant, it is of even higher importance to have a precise prediction for the dominant SM backgrounds including heavy flavor production. This is reflected for example by the large efforts spent in the LHC Higgs Cross Section Working Group [5] to understand the modeling of the ttbb background to ttH(bb).
Including heavy-quark mass effects in the QCD evolution, e.g. in fits of parton distribution functions (PDFs), is a prerequisite. The fixed flavor number scheme (FFNS) as the simplest approach assumes a fixed number of active quarks which can be varied explicitly [6]. General-mass variable flavor number schemes (VFNS) like ACOT [7][8][9][10], TR [11,12], and FONLL [13,14] on the other hand account for mass effects dynamically above the corresponding thresholds. Hybrid schemes have also been devised [15], and the possibility to perform the PDF evolution for massive quarks has been investigated recently [16]. Accordingly, higher-order corrections to the production of a final state like Hbb or Zbb have been computed for a fixed number of flavors [17][18][19][20][21], and in variable flavor number schemes [22][23][24][25][26][27]. Progress in clarifying the interplay between different schemes was reviewed in [28], and the effect of higher-order electroweak corrections was investigated recently [29].
For experimental analyses a realistic simulation of collision events at the hadron level is crucial. Such simulations are provided by all modern Monte Carlo event generators [30] like Herwig7 [31], Pythia8 [32] and Sherpa [33]. The simulation of heavy-flavor production employed in these programs varies both in method and in accuracy, and a formal comparison to the methods used in analytical calculations is missing so far. Typical event generator setups employ matrix elements in the five-flavor scheme (5FS), where the b-quark is treated as massless and included also as an initial state parton. 1 Before the parton shower is simulated, b-quark masses are restored by means of a kinematics reshuffling, and subsequently massive splitting kernels and kinematics are used in the parton shower, see e.g. [34][35][36][37][38]. This procedure is necessary in particular to avoid an excess in the g → bb splitting rate as it would appear if the fragmentation process was simulated with massless b-quarks.
To increase the accuracy of heavy-flavor production Monte Carlo samples, two approaches have been studied in the literature: The matching of NLO QCD calculations in the four-flavor scheme (4FS) to parton showers using one of the common NLO+PS matching formalisms [39][40][41][42][43]. This method has been used to simulate for example the pp → V bb process class [44][45][46][47] and the pp → ttbb process [48][49][50]. An alternative option to include higher-order QCD corrections in MCs is the use of multi-leg merging methods at LO [51][52][53][54][55][56][57][58][59] or NLO [60][61][62][63][64] accuracy. Heavy-flavor production is then included automatically in the 5FS using massless matrix elements and a massive parton shower as described above. Dedicated studies of this and a comparison to the 4FS can be found in [47].
In this paper we propose a novel method that presents a hybrid between these two approaches. Embedding a massive NLO+PS calculation for pp → Zbb into a massless multi-leg merging of pp → Z+jets we retain the theoretical advantages of both methods and for the first time allow a rigorous combination of the two calculations without overlap. The latter is not merely a technical or academic point, but crucial to allow the usage of NLO-accurate heavy flavor predictions in experiments: If heavy flavor and light jet production can not be described simultaneously, it is impossible to make predictions in the presence of fake heavy flavor jets or evaluate experimental efficiencies related to them. This paper is organized as follows: After a short review of multi-jet merging in Sec. II, we describe our method from an algorithmic point of view in Sec. III. Its formal relation to the FONLL method is studied in Sec. IV. Finally, in Sec. V we demonstrate an implementation of the new method within the Sherpa event generator using pp → Zbb production as a test case.

II. MULTI-JET MERGING IN A FIXED FLAVOR NUMBER SCHEME
In the context of Monte-Carlo event generators, multi-jet merging refers to algorithms that systematically improve the accuracy of traditional LO+PS simulations by adding higher multiplicity fixed-order calculations with wellseparated parton-level jets to the simulation. Merging algorithms are constructed such as to obtain a consistent result that preserves both the logarithmic accuracy of the parton shower and the fixed order accuracy of all higherorder results. Such a treatment becomes important if kinematics and correlations between jets have to be accurately predicted. In the Sherpa event generator, multi-jet merging is implemented at leading and next-to-leading order QCD accuracy, using the MEPS@LO [57] and MEPS@NLO [60,61] method, respectively. Both methods are explained in full detail in their respective references; here we briefly summarize the main ideas that are relevant to extend them to a variable number of parton flavors.
The combination of resummation and higher-order perturbative calculations by merging involves two aspects: 1. The phase space of resolvable emissions in the resummation must be restricted to the complement of the phase space of the fixed-order calculation. For example, in the combination of pp → Z and pp → Zj, with p T,j > p T,cut , the phase space of the first emission in the resummation would be restricted to p ⊥ < p ⊥,cut . This restriction is called the jet veto, the variable used to separate the phase space is called the jet criterion, 2 and the separation scale is called the merging scale.
2. The fixed-order result must be amended by the resummed higher-order corrections in order to maintain the logarithmic accuracy in the overall calculation. This is formally relevant only if the fixed-order calculation is used in a region of phase space where resummation is both relevant and reliable, i.e. for merging scales smaller than the resummation scale. This procedure consists of (a) Re-interpreting the final-state configuration of the fixed-order calculation as having originated from a parton cascade [65]. This procedure is called clustering, and the representations of the final-state configuration in terms of parton branchings are called parton-shower histories.
(b) Choosing appropriate scales for evaluating the strong coupling in each branching of this cascade, thereby resumming higher-order corrections to soft-gluon radiation [66,67]. 3 This procedure is called α s -reweighting.
(c) Multiplying by appropriate Sudakov factors, representing the resummed unresolved real and virtual corrections [51]. This procedure is called Sudakov reweighting, and it is usually implemented using pseudoshowers [54].
The jet clustering procedure terminates when either no more combination of particles according to the QCD Feynman rules can be performed, or when the scale hierarchy would be violated by a new combination. Among all possible parton-shower histories that can be constructed, one is chosen probabilistically according to the associated weight, to represent the event. This weight is computed as the product of the weight of the irreducible core process left after the clustering has terminated, multiplied by the differential radiation probability at each of the nodes of the branching tree. Note that these probabilities depend on the parton-shower algorithm. Some representative parton-shower histories, together with different core processes, are shown in Fig. 1. Figure 1 (a) is the starting point of the jet clustering procedure in gg → Zbb configurations, which correspond to a leading-order prediction for pp → Zbb in the four-flavor scheme. The first QCD clustering that can be performed on this configuration is the combination of a final-state (anti-)quark and an initial-state gluon, as indicated in Fig. 1 (a2). The second would then be the combination of the other final-state (anti-)quark and the second initial-state gluon, leading to Fig. 1 (a3). The core process in this case is bb → Z, which corresponds to the lowest-order hard matrix element in the five-flavor FIG. 1: Representative parton-shower histories for gg → Zbb (a) and qq → Zbb (b) matrix-element configurations.
The gray blobs correspond to the irreducible core processes.
scheme. This example shows how the construction of parton-shower histories from hard matrix elements provides a natural matching of the two schemes. We will expand on this idea in Secs. III and IV. If the scale in the second clustering step, leading from Fig. 1 (a2) to Fig. 1 (a3) is lower than the scale in the first step, we speak of a violation of the scale hierarchy in the merging. In this case, the clustering procedure terminates after the first branching, and the core process is gb → Zb (gb → Zb). It is also possible that the scale in the first clustering step exceeds all subsequently defined scales, including the scale associated with the core process. In this case, it may still be possible to perform an electroweak clustering, leading to Fig. 1 (a1), and defining the core process gg → bb. The correct probability for this would be given by electroweak evolution equations [68][69][70], which are not yet implemented in standard parton showers. Therefore, one can choose to either neglect this clustering path ("exclusive clustering"), or allow it using ad-hoc clustering probabilities ("inclusive clustering"). Eventually, if no clustering can be performed due to the scale hierarchy, the core process may correspond to the starting configuration in Fig. 1 (a). Similar arguments apply to the possible parton-shower histories shown in Fig. 1 right. Again, the four-flavor scheme expression would correspond to Fig. 1 (b), and the analogue in the five-flavor scheme would be Fig. 1 (b4).

III. MULTI-JET MERGING IN A VARIABLE FLAVOR NUMBER SCHEME
In this section we describe our new algorithm, which combines a merged calculation in the five-flavor scheme with a prediction for heavy quark associated production. Both the merged and the heavy flavor prediction may be computed at leading order or at next-to-leading order QCD. The combination is achieved by means of a dedicated heavy flavor overlap removal. It acts on top of multi-jet merging algorithms and we call this technique fusing. We first explain it from a phenomenological point of view, using the example of Z+jets / Zbb production. The formal connection to the FONLL method will be established in Sec. IV.
The basic idea of the fusing approach is as follows: 1. Start with a merged simulation of the inclusive reaction, e.g. Z+jets and a calculation of heavy quark associated production, e.g. Zbb.
2. Process the Zbb simulation as if it was part of the multi-jet merged computation, i.e. apply the clustering, the α s reweighting and the Sudakov reweighting. The renormalization and factorization scales for the core process should be calculated using a custom scale definition, and the scales of all reconstructed splittings should be set to the transverse momenta in the branching [66], including higher-order corrections to soft-gluon evolution [67], cf. Sec. II. This part of the fused result will be called the direct component, as the final-state bottom quarks are generated in the fixed-order calculation.
3. Remove all final-state configurations from the five-flavor scheme merged simulation of Z+jets that have a partonshower history which can also be generated in the reweighted Zbb computation. The remainder of the five-flavor scheme result may still contribute configurations with final-state bottom quarks. This part of the fused result . . .

FIG. 2:
Example parton-shower histories for gg → Zbbg (a), and qq → Zbbg (b). Depending on the clustering path, configurations (a5) and (b3) may be identified with a Zbb topology at leading order, while configurations (a4) and (b4) may not. At next-to-leading order, all configurations can be identified with a Zbb topology.
will be called the fragmentation component. The removal of overlap between the Z+jets and the Zbb calculations is eventually achieved by both the Sudakov reweighting in step 2 and the event rejection in step 3. The application of Sudakov vetoes to Zbb restores the correct behavior of the direct component in those regions of phase space that feature a hierarchy between the hard scale and the b-quark mass. The event rejection in the Z+jets sample removes those final-state configurations which would otherwise be double counted. Algorithmically this rejection is performed as follows: 1. Create a combined evolution history, starting from the core process in the jet clustering, and ending with all final state particles produced either in the hard matrix element or by the shower.
2. Starting from the core process, find the first configuration where a bb-pair appears in the final-state.
3. If there is no such configuration, keep the event. Otherwise, count the number of additional light partons n light (quarks or gluons) in the final state of this configuration. This corresponds to the number of hard emissions before the bb-pair production according to the ordering imposed by the cluster (shower) algorithm. At leading order, discard the event if n light = 0. At next-to-leading order, discard the event if n light ≤ 1.
Typical parton-shower histories for candidate Z+jets events with gluon or quark initial states are shown in Fig. 2.
If the clustering leading to configurations (a5) / (b3) proceeds along (a3) / (b1), the scale associated with the gluon emission is the smallest in the process. The configuration can then be identified with a Zbb topology, and the event will be discarded. If the clustering proceeds along (a2) / (b2), the treatment depends on whether the fusing is performed at leading or at next-to-leading order. At leading order the configuration cannot be identified with a Zbb topology, and the event will be kept. At next-to-leading order, the configuration corresponds to a real-emission configuration, and the event will be discarded. Figure 2 (a4) / (b4) displays a parton-shower history that does not have a counterpart in the heavy-flavor result at leading order, such that the corresponding event would be kept irrespective of the scale hierarchy. At next-to-leading order, the event would be discarded. The extension to histories with more partons in the final state is straightforward and will lead to configurations that contribute to the fragmentation component also in the next-to-leading order case.
Special care has to be taken when dealing with unordered configurations. They arise when the clustering algorithm can not reconstruct a strictly k T -ordered history leading to a pp → Z core process (cf. Sec. II). In such cases, the clustering can either stop with a 2 → n core process, or continue by allowing to violate the scale hierarchy. Both variants can be used for the fusing algorithm as long as all components and all parton multiplicities are treated identically. In this work we restrict ourselves to a fully-ordered clustering algorithm.

IV. RELATION TO THE FONLL METHOD
This section will establish the relation between our merging algorithm and the FONLL method [23,24]. In the FONLL technique, the cross section of the combined event sample is generated as where σ (5) and σ (4) are the cross sections in the five-and four-flavor scheme, respectively, and σ (4),(0) is the four-flavor scheme result in the limit m b → 0. Eventually, all results should only depend on the PDFs and strong coupling in the five-flavor scheme. Formally, this is achieved by writing the cross section as The hard coefficients B ij can then be expanded in powers of the strong coupling as and are determined such that the four-flavor scheme result in terms of four-flavor PDFs is eventually recovered at the target accuracy given by the upper limit of the sum. The coefficients σ (4),(0) , needed for removal of the overlap between the fully massive and the massless calculation, can be extracted from the five-flavor scheme result [24] by expressing the b-quark PDF up to O(α s ) in terms of the fourflavor scheme light quark and gluon PDFs using the matching coefficients from [71], and subsequently re-expressing the result in terms of five-flavor scheme PDFs and α s [14]. The result is where L = ln Q 2 /m 2 b and f Σ = a=q,q f a (x, Q 2 ). We can now use the O(α n s ) five-flavor scheme partonic cross sectionŝ σ (n) to define the massless limit of the coefficient functions B (n) . In the processes of interest to us the partonic cross section is invariant under exchange of b andb. The O(α 2 s ) terms are then given by The O(α 3 s ) terms are given by The matching coefficients in Eq. (4) can be expanded in a power series in L as In the parton-shower approach, each logarithm L arises from integrating Eq. (A7) over ln t from m 2 b to Q 2 . By comparing to Eq. (4) we find that A (1) gb (z, L) = P gq (z)L. We will comment on the remaining coefficients in Sec. IV B. The non-logarithmic terms, a (2,0) (z), are needed only for matching beyond NLL accuracy and can therefore be ignored in our approach. The leading and sub-leading logarithmic terms can be derived from renormalization and collinear mass factorization of the operator matrix elements for heavy quark production [72]. They take the simple form The leading-order and relevant next-to-leading order splitting functions entering Eqs. (8) are given in App. B. The negative term in a (2,2) gb (z) and the coefficient a Σb (z) are collinear mass factorization counterterms that arise from the different number of quark flavors in the infrared and the ultraviolet regime [71,72].

A. Leading Order and Leading Logarithmic Accuracy
In order to prove that the heavy flavor overlap removal algorithm proposed in this publication amounts to a variant of the FONLL method we need to show that the removal of events from the five-flavor sample as proposed in Sec. III is equivalent to the subtraction of σ (4),(0) in the FONLL technique. We will start at leading order and leading logarithmic accuracy and comment on next-to-leading order and next-to-leading logarithmic accuracy in the next section.
The simplest configurations areσ (2) qq andσ (2) gg in Eq. (5). They correspond to removal of the double-real radiative corrections to the bb → Z process, which have a counterpart in the four-flavor scheme. Note that, in the notation of Eq. (5),σ (2) qq andσ (2) gg are integrated over the double-real radiative phase space and combined with the renormalized virtual corrections and collinear mass factorization counterterms, which renders bothσ (2) qq andσ (2) gg individually finite. In the MS scheme, the factorization scale dependent remainder combines with the PDF evolution to give the second and third expression on the right-hand side of Eq. (5). In a multi-jet merging approach, no singularities arise because we effectively use a cutoff regulator for collinear mass singularities that is defined by the jet cuts. For gg initial states we obtain where the subscript LL indicates leading logarithmic accuracy. The scales Q 1 and Q 2 denote the jet resolution in the final and next-to-final clustering of the merging algorithm, while Q cut stands for the merging scale. The Θ-functions represent the phase-space partitioning in the merging procedure with at least two jets in addition to the production of the inclusive final state in the five-flavor scheme. The last approximation is valid if the merging cut is small enough that below it we can factorizeσ (2) gg andσ (1) gb intoσ (0) bb and P gb , and if we can ignore the finite remainder of the virtual corrections included in Eq. (5). The subtraction of σ (4),(0) from σ (5) in Eq. (1) can therefore be achieved by using the algorithm in Sec. III. In the case of gg initial states it proceeds as follows: 1. Construct a parton-shower history according to the multi-jet merging procedure, perform the parton shower and add any splittings that were generated to the history 2. Starting at the core interaction identified in the merging, trace the parton-shower history. Veto the event if (a) the core process is bb → Z, followed by an initial-state g → bb and g →bb branching (b) the core process is gb → Zb (gb → Zb), followed by an initial-state g → bb (g →bb) branching (c) the core process is gg → Zbb The solution is similar for quark initial states, only the sequence of parton-shower splittings differs. The multi-jet merged expression reads qq,MEPS τ, The origin of a B. Next-to-Leading Order and Next-to-Leading Logarithmic Accuracy In order to achieve next-to-leading logarithmic accuracy according to the FONLL method, the parton shower employed in the merging must implement all coefficient functions in Eq. (8). We start with the β 0 dependent contribution to a this term can either be implemented explicitly, or absorbed into the scale choice connected to the evaluation of a (1,1) gb (z). In the latter case, the strong coupling in initial-state g → bb splittings should be computed at m 2 b . Because we use a strong coupling in the five-flavor scheme, an additional counterterm of the form α s /(2π)β 0,b L will then be required.
The coefficients a (2,2) (z) can be expressed in the parton-shower formalism as the convolution of a (1,1) gb (z) with the emission and no-emission probability of the parton shower, expanded to second order in the strong coupling and integrated over the evolution parameter from m 2 b to Q 2 . To show this, we employ the correspondence between inclusive and exclusive parton evolution summarized in App. A. Making use of Eq. (A8) and the boundary condition f b (x, m 2 b ) = 0, a single step in the parton-shower backward evolution, generating a resolved g → bb transition at scale Q 2 can be written formally as Upon expansion to O(α s ), we obtain the leading-order coefficient of Eq. (4). To reconstruct the first term in a (2,2) gb (z), we need to account for a second step in the parton-shower evolution, preceding the g → bb transition. This gives Subtracting the leading-order term in Eq. (12), and expanding to second order in the strong coupling, we can write Using Eq. (A2) and (A3), and taking the limit ε, ε → 0, gives Finally, we make use of the fact that at O(α 2 s ) there is no further dependence on t andt. Integrating them out we obtain the contribution of the first term in a (2,2) gb (z) to Eq. (4) The coefficient a Σb (z) and the final term in a (2,2) gb (z) are derived in a similar way. The difference compared to the previous case is that the hierarchy between m 2 b and Q 2 is ill-defined, because the second branching happens at smaller scales than the g → bb transition. The complete parton-shower expression for the splitting kernel in a (1,1) gb (z) and one step of the subsequent evolution reads Note that we have introduced an auxiliary scale, q 2 , in order to perform the integral over the second branching. Subtracting the leading-order term in Eq. (12), and expanding to second order in the strong coupling, we can write Using again Eq. (A2) and (A3), taking the limit ε, ε → 0, and integrating over t andt, we obtain Note that q 2 plays the role of the collinear mass factorization scale, while Q 2 corresponds to the UV renormalization scale. In the parton-shower approach, the two are strictly ordered, as the second branching cannot take place before the first one. In a fixed-order computation, we can instead set q = Q. This corresponds to treating the UV renormalization and collinear mass factorization counterterms in Eqs.
The coefficient a (2,2) qq,b (z) can be derived in a similar fashion. The difference compared to a (2,2) Σb (z) lies in the fact that the phase space for the final-state branching of the intermediate gluon into bb can be integrated out, leading to the coefficient −β 0,b . The complete parton-shower expression reads This can be expanded to second order in the strong coupling as in Eqs. (14) and (18). Taking the limit ε, ε → 0, and integrating over t andt, we obtain the contribution of a (2,2) qq,b (z) to Eq. (4) Combining the parton-shower effects leading to Eqs. (16), (20), (22) and (11), we obtain all double logarithmic coefficients in Eq. (8). The single logarithmic coefficients are not reproduced by a standard parton shower and must be implemented separately. In the case of a (2,1) Σb and a (2,1) qq,b this can be achieved using the algorithm derived in [73]. Although a complete Monte-Carlo implementation has not yet been presented for a (2,1) gb , it is clear that it can be constructed using the techniques of [73] and [74]. In the foreseeable future it will therefore be possible to achieve next-to-leading logarithmic accuracy according to the FONLL classification by using our approach.
In order to achieve next-to-leading order accuracy according to the FONLL method, we must reconstruct the coefficient functions in Eq. (6). The modification of the leading-order result, Eq. (5), by the multi-jet merging procedure has already been discussed in Sec. IV A. In complete analogy, the terms proportional toσ bb in Eq. (6) are modified by Θ(Q cut − Q 1 )Θ(Q cut − Q 2 ) in the merging, while the terms proportional toσ gb andσ qb are modified by Θ(Q 1 − Q cut )Θ(Q cut − Q 2 ). It remains to adjust the result for the fact that the four-flavor scheme coefficients in the perturbative expansion are determined using a strong coupling in the five-flavor scheme. We use Eqs. (8) and (9) of [24] to correct this mismatch. Technically this is achieved by modifying the event weight w of four-flavor scheme S-events with gg or qq initial-state as where w Born and w ME are the matrix-element weights of the Born contribution and the full S-event, respectively.

V. RESULTS
The algorithm described in Sec. III has been implemented in the Sherpa event generator in full generality, and will be investigated in the following for the example of heavy flavor production in association with a Z boson.
As argued in Section IV B, a NLO-accurate parton shower would be needed to fully match the next-to-leading logarithmic accuracy of the FONLL method. Such a shower is not available yet but recent studies indicate that the central predictions provided by it should be close to the LO result [73,74]. We thus base our new approach on the established MEPS@NLO algorithm, which we have extended to provide the necessary fully-ordered clustering and combined parton-shower history for Sherpa version 2.2.7. The event filter for the overlap removal described in Sec. III and the counter-terms described in Eq. (23) are implemented as user hooks. The event filter can either directly be used to veto events or store the veto information as alternative event weight. This allows to produce a Z+jets sample which is usable both standalone, or within a fused prediction by applying the corresponding event weight and adding a dedicated sample only for the direct component.
In the next sections we compare predictions obtained by the newly developed algorithm against existing predictions in the 5FS and the 4FS. In all of them the matrix elements are generated using Sherpa's internal matrix element generators Amegic++ [75] and Comix [76] for tree-level diagrams. Virtual diagrams are interfaced from OpenLoops 1.3.1 [77], using CutTools [78] and OneLoop [79].
In the 5FS prediction and for the fragmentation component, matrix elements are calculated for pp → + − + 0, 1, 2j@NLO + 3j@LO. Here, + , − refers to either electrons or muons and j to a well separated parton. The merging cut Q cut is set to 20 GeV if not stated otherwise. The direct component and the standalone 4FS prediction are based on pp → + − bb matrix elements at next-to-leading order, matched to the parton shower using the formalism in [42].
For both components of the fusing approach, and for the standalone 5FS and 4FS predictions, all scales are evaluated according to the METS scheme with inclusive clustering 5 . In all predictions except for the standalone 4FS prediction the clustering is required to be fully ordered. The factorization and (where applicable) renormalization scales in the core process are evaluated according to for Zj, We use the NNPDF3.0 set [80] at NNLO with five active flavors for both components of the fusing approach and for the 5FS prediction, and with four active flavors for the 4FS prediction. These PDF sets are interfaced to Sherpa using LHAPDF [81]. We use the CS-Shower [35] as implemented in Sherpa with two minor modifications. Firstly, the strong coupling in g → bb splittings is evaluated at the virtuality of the intermediate parton, to account for the fact that there is no soft gluon emission, and therefore no higher-order corrections enhanced by α s /(2π)β 0 ln k 2 T /Q 2 . Secondly, we choose the evolution variables of Scheme 1 in [82], but we add the squared masses of the final-state partons in the branching. The default multiple interactions [83] and hadronization models [85] implemented in Sherpa are employed in all simulations. Analyses of the event samples are performed within the Rivet framework [86]. Scale variations (µ F , µ R ) are studied using the on-the-fly-variation method as implemented in Sherpa [87].

A. Validation in inclusive Z phase space
If the event selection does not explicitly require any b-jet, our fusing approach is expected to agree with the MEPS@NLO prediction. This is validated here with 7 TeV data from ATLAS [88]. In this measurement, Z bosons decaying to electrons or muons were measured in association with jets. Leptons are required to have a transverse momentum of p ⊥ > 20 GeV and a combined invariant mass with 66 GeV < m < 116 GeV. Jets are defined by the anti-k t algorithm with R = 0.4 with a transverse momentum of p j ⊥ > 30 GeV and a minimal angular distance to the leptons, ∆R(j, ) > 0.5.
In Fig. 3, the transverse momentum spectrum p Z ⊥ of the Z boson and the scalar sum of the jet transverse momenta, S T , are shown. As expected, the fusing prediction is dominated by the fragmentation component in this region of phase space. The full 4FS prediction still reaches up to 10 % of the cross section, showing the necessity for a rigorous  combination. In both distributions, the new prediction is compatible with the experimental data, and the agreement with the MEPS@NLO prediction demonstrates that the fusing algorithm does not induce any unexpected features.

B. Results in Zbb phase space
For the validation of our newly developed approach in a Zbb region we use 8 TeV data taken by CMS [89]. There, the Z boson is reconstructed from either two electrons or two muons in a mass window between 71 GeV and 111 GeV. These leptons are required to have p ⊥ > 20 GeV and |η| < 2.4, and are dressed with photons within a cone of ∆R < 0.1. Jets are defined by the anti-k t algorithm with R = 0.5, p ⊥ > 30 GeV and |η| < 2.4. Only jets with no overlap (∆R > 0.5) to leptons are taken into account. b-jets are identified by ghost association [90] and have to pass the same jet cuts as described above.
Again, we compare our newly obtained prediction to the experimental data and to predictions obtained in the 4FS and 5FS. In addition, we estimate the perturbative uncertainties and the uncertainties related to the merging scale. The former are given by 7-point variations of µ R and µ F 6 coherently in the matrix elements and the parton shower. The latter are studied by a variation of the merging cut to values of 15 or 30 GeV. The total cross sections for having at least one or at least two b-jets are displayed in Table I and differential distributions for one (two) b-jet observables are shown in Fig. 4 (Fig. 5).
In the one b-jet region, the predicted cross sections of the fused result and the 4FS prediction are in good agreement with the data, whereas the 5FS prediction exceeds it by 34 %. Differential cross sections for several distributions are given in Fig. 4. In general, the fused prediction is in good agreement with the data and in between the 4FS and 5FS predictions. Whereas the former slightly undershoots the data, the latter has a significantly larger cross section. This holds in particular for small transverse momenta of either the b-jet or the Z boson or if the b-jet is close to the Z boson. Both fusing components are equally relevant in all distributions, with a highly non-trivial phase-space dependence of their relative contributions. While they are relatively flat and equal in the transverse-momentum spectrum of the leading b-jet, the direct component dominates around the peak of p ⊥ (Z). A different composition is found in the S T distribution and for ∆Φ(Z, b). At large S T or small ∆Φ(Z, b) the fragmentation contribution takes over and the direct one only contributes with around 20 %. At the same time, the 4FS prediction undershoots the high S T region. This region is sensitive to a good modeling of multiple hard jets, which can only be predicted reliably by multi-jet matrix elements. This is the case in the fusing procedure, where in the fragmentation component the hard emissions are generated first by matrix elements and b quarks may be produced later on in the shower. In the 4FS prediction on the other hand, the b-quarks are always described by matrix elements and additional emissions of light partons by the parton shower can form hard jets in the end. Thus, the modeling of this region with a 4FS prediction becomes very sensitive to the parton shower which is applied outside its region of validity. The uncertainties related to the merging scale in the fragmentation component and the perturbative uncertainties are depicted in the lower ratio-plots in Fig. 4 for all distributions. The perturbative uncertainties are at the level of 15-20 % in all observables. Results with different merging cuts are all within the perturbative uncertainty band and in good agreement with each other. The Q cut = 15 GeV curve has significantly higher statistical uncertainties, which is a typical feature for multi-jet merged predictions with very low merging scales.
In the two-b-jet region, the total predicted cross sections of both the 4FS and 5FS are in good agreement with the experimentally measured ones. The fused prediction is slightly lower but still matches the data within the uncertainties. Differential distributions for this region of phase space are given in Fig. 5     closely the 4FS result but has a slightly smaller cross section in some bins. In all regions of phase space, the direct component gives the dominant contribution of the fused prediction. Only for b-jet pairs which are collimated or have a small invariant mass, the fragmentation component exceeds the 20 % threshold, which demonstrates the expected transition towards unresolved one-b-jet configurations. The perturbative uncertainties are reduced in comparison to the one-b-jet region. They are at the level of 5 % for low values of p Z ⊥ and reach up to 15 % for larger values. Again, all merging cut variations are within the perturbative uncertainty band, with the Q cut = 15 GeV curve again yielding large statistical uncertainties in some bins. It is worth noting that the fused prediction has a significantly smaller statistical uncertainty than the 5FS prediction, although the fragmentation component was generated with the same number of events. This is expected in all regions of phase space where the direct component dominates since only a small fraction of 5FS events will yield two b-jets. The direct component profits from its explicit production of heavy-flavor final states and can fill the phase space more efficiently.

VI. CONCLUSIONS
We have presented a novel event generation algorithm to simulate heavy-flavor associated production in collider experiments. Building upon the established merging algorithms for multi-jet matrix elements and parton showers, we propose a technique to include massive matrix elements for heavy-quark production, effectively leading to an MC simulation in a variable-flavor-number scheme, which we call fusing.
The overlap between the five-and four-flavor scheme calculations is removed based on a parton-shower interpretation of the full parton evolution from the hard scale to the parton shower cut-off. This evolution history is also used to supplement the massive matrix elements with all higher-order corrections necessary to maintain the logarithmic accuracy of the multi-jet merged calculation.
Our algorithm allows to combine the advantages of inclusive five-flavor scheme calculations with the higher precision of four-flavor scheme calculations in regions of phase space where the bottom quark mass sets a relevant scale. Such a combined prediction is crucial for heavy-flavor measurements in LHC experiments, since they will always be affected by the presence of fake heavy-flavor tagged jets.
The fusing algorithm can be applied at leading order or next-to-leading order QCD. Its logarithmic accuracy depends on the parton shower used in the merging and might be extended to NLL in the near future. The relation to the FONLL method has been established by analytically identifying the known FONLL matching coefficients within the fused parton shower expressions.
Using an implementation in the Sherpa event generator, we show a first application to heavy-flavor production in association with a Z boson. Cross checks of exclusive observables and a comparison of the results for heavy-flavor production to experimental data demonstrate the improvement over existing multi-jet merging algorithms.