Parton showers beyond leading logarithmic accuracy

Parton showers are among the most widely used tools in collider physics. Despite their key importance, none so far has been able to demonstrate accuracy beyond a basic level known as leading logarithmic (LL) order, with ensuing limitations across a broad spectrum of physics applications. In this letter, we propose criteria for showers to be considered next-to-leading logarithmic (NLL) accurate. We then introduce new classes of shower, for final-state radiation, that satisfy the main elements of these criteria in the widely used large-$N_C$ limit. As a proof of concept, we demonstrate these showers' agreement with all-order analytical NLL calculations for a range of observables, something never so far achieved for any parton shower.

Parton showers are among the most widely used tools in collider physics. Despite their key importance, none so far has been able to demonstrate accuracy beyond a basic level known as leading logarithmic (LL) order, with ensuing limitations across a broad spectrum of physics applications. In this letter, we propose criteria for showers to be considered next-to-leading logarithmic (NLL) accurate. We then introduce new classes of shower, for final-state radiation, that satisfy the main elements of these criteria in the widely used large-NC limit. As a proof of concept, we demonstrate these showers' agreement with all-order analytical NLL calculations for a range of observables, something never so far achieved for any parton shower.
High-energy particle collisions produce complex hadronic final states. Understanding these final states is of crucial importance in order to extract maximal information about the underlying energetic scattering processes and the fundamental Lagrangian of particle physics. To do so, there is ubiquitous reliance on general purpose Monte Carlo (GPMC) event generators [1], which provide realistic simulations of full events. A core component of GPMCs is the parton shower, a subject of much recent research . Partons refer to quarks and gluons, and a shower aims to encode the dynamics of parton production between the high-energy scattering (e.g. production of electroweak or new-physics states) and the low scale of hadronic Quantum Chromodynamics (QCD), at which experimental observations are made.
Typically parton showers are built using a simple Markovian algorithm that takes an n-parton state and stochastically maps it to an n + 1-parton state. The iteration of this procedure, e.g. starting from a 2-parton state, builds up events with numerous partons. A further step, hadronisation, then maps the partons onto a set of hadrons. Even though this last step involves modelling [29,30], many of the features of the resulting events are driven by the parton shower component which is, in principle, within the realm of calculations in perturbative QCD. This is because the showering occurs at momentum scales where the strong coupling, α s is small.
Much of collider physics, experimental and theoretical [31][32][33][34], is moving towards high precision, especially in view of large volumes of data collected so far at CERN's Large Hadron Collider (LHC). On the theoretical front many of the advances either involve approximations with a small number of partons, or else are specific to individual observables. Parton showers, in contrast, use a single algorithm to describe arbitrary observables of any complexity. This versatility comes at a cost: lesser accuracy for any specific observable and, quite generally, at best only limited knowledge [35][36][37][38] of what the accuracy even is for a given observable. In fact there is currently no readily accepted criterion for categorising the accuracy of parton showers. One novel element that we introduce in this paper is therefore a set of criteria for doing so.
The role of showers is to reproduce emissions across disparate scales. Our first criterion for accuracy starts by structuring this phase space: there are three phase space variables per emission, and two of them (e.g. energy and angle) are associated with logarithmic divergences in the product of squared matrix element and phase space. We define LL accuracy to include a condition that the shower should generate the correct effective squared tree-level matrix element in a limit where every pair of emissions has distinctly different values for both logarithmic variables. At NLL accuracy, we further require that the shower generate the correct squared tree-level matrix element in a limit where every pair of emissions has distinctly different values for at least one of the logarithmic variables (or some linear combination of their logarithms). Beyond NLL accuracy we would consider configurations with a pair of emissions (or multiple pairs) both of whose logarithmic variables are similar.
To help make this discussion concrete, let us consider showers that are not NLL accurate according to this criterion: angular ordered showers [39][40][41] do not reproduce the matrix element for configurations ordered in energy, but with commensurate angles, and this is associated with their inability to correctly predict α n s L n (NLL) effects for non-global observables [36]. Transversemomentum (k t ) ordered showers with dipole-local recoil [2,3,5,11,42,43] do not reproduce matrix elements for configurations ordered in angle but with commensurate transverse momenta, because of the way they assign transverse recoil [37]. As a result they fail to reproduce NLL effects in global observables such as jet broadenings. Showers that omit spin correlations fail to reproduce (the azimuthal structure of) matrix elements for configurations ordered in angle but with commensurate energies [44][45][46], and associated NLL terms.
Our second criterion for logarithmic accuracy tests, among other things, the overall correctness of virtual corrections. For showers that intertwine real and virtual corrections directly through unitarity, once the generation of tree-level matrix elements is set, there is only one (single-emission) degree of freedom that remains, namely the choice of scale and scheme for the strong coupling for each emission, as a function of its kinematics. To claim NLL accuracy, we will require the resulting shower to reproduce known analytical NLL resummations across recursively infrared and collinear safe (rIRC) [47] global and non-global two-scale observables as well as (subjet) multiplicities.
The challenge that we concentrate on here is to formulate showers that can handle each of two regions correctly: the energy-ordered, commensurate-angle region; and the angular-ordered, commensurate k t region. Recall that existing k t and angular-ordered showers can each handle one of these limits, but not both. Strictly, full NLL accuracy also requires attention to the angularordered, commensurate energy region. However, given that general solutions for the required spin correlations are known to exist [45,[48][49][50], and that they affect only a small subset of observables, we postpone their study to future work. For now, we also restrict our attention to final-state showers (i.e. lepton-lepton collisions), massless quarks and the large-N C limit. Our guiding principle will be that soft emissions should not affect, or be affected by, subsequent emissions at disparate rapidities.
The two classes of shower that we develop both consider emissions from colour dipoles. We consider a continuous family of shower evolution variables v, parameterised by a quantity β in the range 0 ≤ β < 1, where β = 0 corresponds to transverse-momentum ordering. The phase space involves two further variables besides v: a pseudorapidity-like variable within the dipole,η, and an azimuthal angle φ.
We start with a shower with dipole-local recoil (the PanLocal shower). Its mapping for emission of momen- where k ⊥ = k t [n ⊥,1 cos φ + n ⊥,2 sin φ], with n 2 ⊥,m = −1, n ⊥,m · p i/j = 0 (m = 1, 2), n ⊥,1 · n ⊥,2 = 0 and Here sĩ = 2 p i · p j , sĩ = 2 p i · Q, and Q is the total event momentum. The light-cone components of p k are given by The quantity f in Eq. (1) determines how transverse recoil is shared between p i and p j , cf. below. The a i , b i , a j , b j are fully specified by the requirements p 2 i/j = 0, (p i + p j + p k ) = ( p i + p j ) and p i =p i for k t → 0 and are given explicitly in Ref. [51].
In the event centre-of-mass frame,η = 0 corresponds to a direction equidistant in angle from p i and p j . For soft-collinear emissions, the physical pseudorapidity, η = − ln tan θ 2 , with respect to the emitter is η = |η| + 1 β ln ρ. Soft-collinear emissions from distinct dipoles but with the same ln v fall onto common contours in the Lund plane [52], k t = ve β|η| .
A key difference relative to earlier showers is that our transition in transverse recoil assignment between i and j takes place atη 0, i.e. equal angles between the p i and p j directions in the event centre-of-mass frame (note similarities with Deductor [9]). This differs from the common choice of a transition in the middle of the dipole centre-of-mass frame. Our choice ensures that a given emission will not induce transverse recoil in earlier, lower-rapidity emissions. Additionally, we require β > 0 in the definition of the ordering variable, Eq. (2). This causes emissions at commensurate k t and widely separated in |η| to be effectively produced in order of increasing |η|, so that any significant k t recoil is always taken from the extremities of a (hard) qg{. . .}gq dipole chain. Together, these two elements provide a solution to the problem observed in Ref. [ [51,53]; successively smaller αs values keep fixed αs ln kt1. Middle: the same for the PanGlobal(β = 0) shower. Right: the αs → 0 limit of the ratio for multiple showers. This observable directly tests part of our NLL (squared) matrix-element correctness condition. A unit value for the ratio signals success.
in common dipole showers causes multi-gluon emission matrix elements to be incorrect in the limit of similar k t 's and disparate angles, starting from α 2 s , leading to incorrect NLL terms.
Note that with dipole-local recoil, NLL correctness also requires β < 1, because with β ≥ 1 the kinematic constraint associated with fixed dipole mass means that a first emission cuts out regions of phase space for a second emission at similar ln v.
A second class of shower can be constructed with global, i.e. event-wide recoil (the PanGlobal shower). It can be formulated in largely the same terms as the dipolelocal recoil shower, but with a two-step recoil procedure. In the first step one sets The second step is to apply a boost and rescaling to the full event (including thep i,j,k momenta) so as to obtain final momenta {p} whose sum gives Q. This approach assigns transverse recoil dominantly to the most energetic particles in the event. Thus emission from a hard qg{. . .}gq dipole string transfers its recoil mostly to the hard q andq ends. This ensures that one reproduces a pattern of independent emission for commensurate-k t and angular-ordered gluons, while also retaining the correct (dipole) pattern for energy-ordered, commensurate angles. This holds even for β = 0, i.e. with k t ordering. Values of β ≥ 1 remain problematic, however. Note that the PanGlobal shower has power-suppressed routes to highly collimated events. These compete with normal Sudakov suppression, as observed also for Pythia8 [37]. We have verified that such effects are small even at the very edges of future (FCC-hh [59]) phenomenologically accessible regions. Nevertheless, ultimately one may wish to explore alternative global recoil schemes.
The next step is to compare our showers to NLL observables. Relative to earlier attempts at such comparisons [60], a critical novel aspect is how we isolate the structure of NLL terms α n s L n . For each given observable v, with L = ln v, we consider the ratio to the true NLL result in the limit α s ≡ α s (Q) → 0 with fixed α s L. This helps us isolate the NLL terms from yet higher-order contributions, which vanish in that limit. Numerically, a parton shower cannot be run in the α s → 0 limit for fixed α s L. However, with suitable techniques [51,[61][62][63], one can run multiple small values of α s and extrapolate to α s = 0. We examine not just our showers, but also our implementations of two typical k t -ordered shower algorithms with dipole-local recoil, those of Pythia8 [2] and Dire [11] (with the α s + Kα 2 s choice as in Eq. (4)). A first test concerns the multiple-emission matrix element. We have constructed our showers specifically so that they reproduce the squared matrix elements in the limits discussed above that are relevant for NLL accuracy. A simple observable for testing this is to consider the two highest-k t Lund-plane primary declusterings [64,65] with transverse momenta k t1 and k t2 (originally defined for hadronic collisions, the e + e − analogue is given in Ref. [51] and implemented with FastJet [66]). The α s → 0 limit for fixed α s L (L = ln k t1 /Q), ensures that the two declusterings are soft and widely separated in Lund-plane pseudorapidity η (which spans |η| |L| ∼ 1/α s ). In this limit the full matrix element reduces to independent emission and so the difference of azimuthal angles between the two emissions, ∆ψ 12 , should be uniformly distributed, for any ratio k t2 /k t1 (recall that strongly angular-ordered soft emission is not affected by spin correlations). We consider the ∆ψ 12 distribution in Fig. 1 The left-hand plot of Fig. 1 shows the Pythia8 dipole algorithm (not designed as NLL accurate), while the middle plot shows our PanGlobal shower with β = 0. The dipole result is clearly not independent of ∆ψ 12 for α s → 0, with over 60% discrepancies, extending the fixed-order conclusions of Ref. [37]. The discrepancy is only 30% for gg events (not shown in Fig. 1), and the difference would, e.g., skew machine learning [67] for quark/gluon discrimination. PanGlobal is independent of ∆ψ 12 . The right-hand plot shows the α s → 0 limit for multiple showers. The overall pattern is as expected: PanLocal works for β = 0.5, but not β = 0, demonstrating that with k t ordering it is not sufficient just to change the dipole partition to get NLL accuracy. Pan-Global works for β = 0 and β = 0.5.
Next, we consider a range of more standard observables at NLL accuracy. They include the Cambridge √ y 23 resolution scale [68]; two jet broadenings, B T and B W [69]; fractional moments, FC 1−β obs , of the energyenergy correlations [47]; the thrust [70,71], and the maximum u i = k ti /Qe −β obs |ηi| among primary Lund declusterings i. Each of these is sensitive to soft-collinear radiation as k t /Qe −β obs |η| , with the β obs values shown in Fig. 2 (right). Additionally, the scalar sum of the transverse momenta in a rapidity slice [72], of full-width 2, is useful to test non-global logarithms (NGLs). These observables all have the property that their distribution at NLL can be written as [47,53,[72][73][74] where Σ is the fraction of events where the observable is smaller than e L (g 1 = 0 for the rapidity slice k t ). We also consider the k t -algorithm [75] subjet multiplicity [51,76].
If it differs from 0, the point is shown as a red square. In some cases (amber triangles) it agrees with 0, though a fixed-order analysis in a fixed-coupling toy shower [37,51] reveals issues affecting NLL accuracy, all involving hitherto undiscovered spurious super-leading logarithmic terms [51]. 1 Green circles in Fig. 2 (right) indicate that the shower/observable combination passes all of our NLL tests, both at all orders and in fixed-order expansions. The four shower algorithms designed to be NLL accurate pass all the tests. These are the PanLocal shower (dipole and antenna variants) with β = 1 2 and the PanGlobal shower with β = 0 and β = 1 2 . To conclude, we have identified two routes towards NLL parton shower accuracy. One involves a modification of the evolution variable and dipole partition, while maintaining dipole-local recoil; the other replaces dipolelocal recoil with event-wide recoil. While further work is needed towards phenomenology, the results shown here represent the first time that individual parton showers are demonstrated to be able to reproduce NLL accuracy simultaneously for both non-global and a wide set of global observables. It is our hope that these results, together with our NLL criteria and validation framework, can provide the solid foundations needed for future development of logarithmically accurate showers. Acknowledgements.
We are grateful to Fabrizio Caola, Silvia Ferrario-Ravasio, Basem El-Menoufi, Alexander Karlberg, Paolo Nason, Ludovic Scyboz, Rob Verheyen, Bryan Webber and Giulia Zanderighi for help-ful discussions and comments on the manuscript. We thank each other's institutions for hospitality while this work was being carried out. This work has been funded by a Marie Sk lodowska Curie Individual Fellowship con-

SUPPLEMENTAL MATERIAL
The main purpose of this supplement is to summarise material that, while not critical to the message and understanding of the results of the manuscript, may be of benefit to those wishing to reproduce our results or consult additional evidence beyond the summary plot, Fig. 2.
Section 1 gives the fully worked out kinematic maps and the specific choices we made for the transition functions g(η) in Eq. (4). Section 2 explains the nature of our toy shower, a useful intermediate step for understanding transverse recoil effects and their impact on logarithmic accuracy. Section 3 explains the origin of the super-leading logarithmic terms found with that toy shower for standard dipole recoil. Section 4 outlines how we have adapted the hadron-collider definition of the Lund declustering [64] to the e + e − case, including the expected result for the normalisation of the ∆ψ 12 distribution. One of our test observables, the subjet multiplicity, differs in the structure of the logarithms from other observables, so that is summarised in Section 5. Finally, section 6 summarises technical challenges that arise when taking α s towards zero with finite α s L and solutions we adopted, which may be of interest to others who wish to embark on a similar path.

Explicit formulae for the PanLocal and PanGlobal showers
In this section we report the explicit equations for the kinematic maps and the branching probabilities of the two NLL showers given in the letter. All Jacobian factors associated with the kinematic maps given in this section have the property that they tend to unity in singular regions of the phase space. For this reason, we omit them in our phase space parametrisation.

a. The PanLocal dipole shower
The kinematic map for the emission of (4). The difference with the common dipole-local recoil scheme is that the transition takes place atη = 0, which corresponds to equal angles between p i and p j in the event centre-of-mass frame.
For simplicity, we give explicit formulas just for the f = 1 case, corresponding physically to a situation where p k is emitted from p i . The extension to f = 0 is straightforward. The coefficients a i,j and b i,j are given by where the coefficients a k and b k are defined in Eq. (3). The phase space boundaries, requiring the conservation of the dipole invariant mass and that all partons remain on shell, map to the following condition, The splitting functions Pĩ →ik (z) of the evolution equation (4) describe the splitting of the parton p i into the partons p i and p k . If p i is a gluon, in the large-N C limit its splitting function is shared between two contiguous dipoles such that and p i emits according to 1 2 P asym. g→X in each of the two dipoles. The splitting functions are given by where z is the momentum fraction (i.e. a k or b k ) of the emitted parton. The asymmetric versions of the splitting functions that we adopt are one of a continuous class of choices that we could have made that are positive definite and satisfy Eq. (9). We finally comment on the function g(η) in Eq. (4). The main constraint that one has is that g(η) + g(−η) = 1, and g(η) = 0 (1) for sufficiently negative (positive)η, while smoothly transitioning aroundη = 0. Several choices are of course possible, and we adopt the following one, η < −1 : g(η) = 0 ; −1 <η < 1 : g(η) = 15 16 We have in particular opted for a function that saturates (smoothly) at |η| = 1, so as to avoid unphysical recoil assignments leaking far in rapidity.

b. The PanLocal antenna shower
The antenna variant of the PanLocal shower differs from the dipole version just described in the distribution of the transverse recoil across the dipole, parametrised by the function f in Eq. (1). Specifically, we adopt the choice As a result, the explicit form of the mapping is considerably more involved than that for the PanLocal dipole shower.
For an emission p k off the dipole { p i , p j }, the coefficients a i,j and b i,j of Eq. (1) are where a k , b k are given in Eq. (3), and For f = 1 the above equations reproduce the map adopted for the PanLocal dipole shower (7). Moreover, analogously to the PanLocal dipole shower, the phase space boundaries are obtained by requiring The splitting functions are those defined in the previous section for the PanLocal dipole shower, while the function g(η) for the PanLocal antenna shower is chosen as g(η) = 1 − g(−η) = f (η).

c. The PanGlobal shower
The PanGlobal shower differs from the PanLocal design in that only the longitudinal recoil is distributed locally among the dipole ends as in Eq. (5). The transverse recoil is handled as follows. Having constructed the intermediate post-branching momenta as in Eq. (5), we evaluate their sum, and then rescale each momentum in the event by a factor to yield Finally we construct the following Lorentz boost and apply it to the latter, primed momenta to obtain the post branching kinematics: The phase space boundaries are simply set by imposing The splitting functions are given in Eq. (10), while for the g(η) function we adopt the same choice as for the PanLocal antenna shower, namely g(η) = 1 − g(−η) = f (η), with f (η) given in Eq. (12).

Toy shower formulation and results
The toy shower is intended to provide a simple way to determine expectations for logarithmic effects in showers with dipole-local recoil. The shower functions in a soft approximation, and uses a fixed coupling. Moreover, it only considers the emission of primary radiation, while neglecting any secondary branchings. These approximations help make the toy model sufficiently simple that it becomes a powerful tool for examining and understanding the interplay between event-shape type observables and parton showers.
Rather than storing 4-momenta, for each particle one stores a 2-dimensional transverse momentum and a rapidity and determines observables directly from those variables. The rapidity is to be understood in a primary Lunddeclustering sense, i.e. as related to the opening angle of the emission from its emitter.

a. Shower definitions
Our notation is as follows: p ⊥,i and η i are, respectively, the (two-dimensional) vector transverse momentum and the rapidity of particle i at a given stage of the evolution. We will use quantities with a hat,p ⊥,i andη i , to denote the values when the particle was first created. When discussing the effect of a specific branching on other emissions, p ⊥,i and η i will be the values before the branching, while p ⊥,i and η i will be those after the branching. Emissions are maintained in a dipole chain that is ordered in increasingη i . When we write a dipole as [ab] it is implicit that η a <η b .
We start from a qq system, where q (q) has infinite positive (negative) rapidity and zero transverse momentum, and an evolution scale ln v = 0 (we implicitly have a centre of mass energy set to Q = 1). The evolution variable v maps to lnp ⊥ andη as ln v = lnp ⊥ − β|η|, where the β-dependent term is relevant only for the PanLocal and PanGlobal showers. This form matches the kinematic pattern of primary-emission generation for the full showers, both dipole and PanLocal / PanGlobal. The probability of evolving from S n → S n+1 in a given slice d ln v of evolution variable is Given an ln v for emission n + 1, the same phase space allows one to selectη n+1 andφ n+1 and then one inserts the emission into the dipole chain, by identifying the partons that haveη immediately below and aboveη n+1 . This structure applies to all our toy showers and they differ only through the recoil prescription that is adopted when inserting an emission. a. Recoil for dipole showers. For k t ordered showers with dipole-local recoil (the toy analogue is as given already in Appendix B of Ref. [37]), like Dire and Pythia8, emission n + 1 will induce recoil in either the left or right-hand member of the dipole into which it is inserted. Generically we encode the recoil as (k, s) where k is the index of the recoiling parton and s the sign of its rapidity modification. The recoiling parton is modified as If k is quark or anti-quark, the rapidities are infinite and so the rapidity modification is irrelevant. For any given dipole, we define a mid-rapidity point Ifη n+1 is below (above) η mid then we take k in Eq. (23) to be the index of the dipole member at smaller (larger) rapidity and s = +1 (s = −1). Note that we have made an explicit choice to use thep variables in defining the midpoints in Eq. (24). The full shower is arguably closer to choosing thep variables instead. However in terms of probing single-logarithmic effects, both choices should give equivalent answers (they differ only in a region of logarithmic width O(1) close to the midpoint), andp variables are a little simpler computationally. Note also that we have a discrete transition in the recoil being assigned to one parton or the other, whereas full dipole showers have a smooth transition (though the Pythia8 and Dire algorithms implement that smooth transition differently). Again the difference is a sub-leading logarithmic effect, because it affects a region of rapidity width of order 1, i.e. without logarithmic enhancements.
b. Recoil for the PanLocal showers. If the emission has positive (negative)η, the recoil is assigned to the dipole member with more positive (negative)η, and with s = −1 (s = +1). In the full shower this is equivalent to saying that if the emission is forward (backward) in the event frame, the recoil is assigned to the more forward (backward) of the two dipole members. This is the same for both the dipole and antenna variants of the PanLocal shower.
c. Recoil for the PanGlobal shower. In the soft limit of the PanGlobal shower, transverse recoil from any given emission is assigned in equal fractions to the quark and anti-quark.
d. Recoil for the CAESAR reference. It will be useful to compare results to known resummations from the CAESAR approach [47]. The CAESAR reference results for recursively infrared and collinear safe global observables can be obtained by setting k to be the anti-quark (quark) forη n+1 < 0 (η n+1 > 0).

b. Observables
We have two main classes of observables, each parameterised by a quantity β which determines their angular dependence. (In the main text we have used β obs , to distinguish this parameter from the β used in the shower evolution variable; in this section, to keep the notation more compact, we drop the "obs" subscript, but with the understanding that β here, when used in the context of an observable, always means β obs .) One class, S β , is additive in the contributions from different emissions and at NLL accuracy it is equivalent to the energy-correlation moments F C 1−β (Appendix I of Ref. [47]) The thrust is equivalent to S β=1 at NLL accuracy. The other class takes the maximum across all emissions and at NLL accuracy the angular-ordered Durham (or Cambridge) √ y 23 = M β=0 . We have also defined left (right) hemisphere versions of these observables, S L,β and M L,β (S R,β and M R,β ), where we restrict the sum or maximum to particles with η i < 0 (η i > 0). Additionally, defining a hemisphere vector transverse sum V L,0 and analogously for V R,0 , we can write the jet broadenings [69] as Finally we define the scalar transverse momentum sum in a slice of half-width ∆ centred at zero rapidity evaluated with β = 0. Note that the toy shower does not capture non-global logarithms for slice and hemisphere observables, since the non-global logarithms are induced by secondary emissions (which, we recall, are not included in the toy shower). However, it is still of interest for these observables, because it can diagnose the presence of at least some classes of super-leading logarithms in showers with dipole-local recoil, as discussed below.

c. All-order results
The toy showers can be used as simplified models of the full showers, to derive quantitative expectations for their logarithmic accuracy. As an example, in Fig. 3 we show a comparison between some toy showers (the PanLocal shower with β = 0 and β = 1/2, and the Dipole shower) and the corresponding full showers for three observables: the angular-ordered Durham √ y 23 resolution scale (left plot), the wide jet broadening B W (middle plot), and a moment of energy-energy correlation FC 1 (right plot). Specifically, the three plots show the ratio Σ shower /Σ NLL in the limit α s → 0 as a function of λ = α s L. Σ NLL stands for the correct NLL prediction for each observable. In the toy shower case, NLL is to be understood in a sense that includes neither running coupling, nor hard-collinear effects, while it does include transverse recoil effects. The NLL prediction is obtained with the toy shower itself using the CAESAR-type recoil described above, though we have also checked that it is consistent with analytical calculations that use the same simplifications. Given that the toy shower doesn't include running coupling, a translation is needed between α s L in the full shower and that in the toy shower. For global observables with β obs = 0, as shown in Fig. 3, that translation is unambiguous and is given by . In all cases in Fig. 3 we observe that the prediction of the toy shower agrees well with the full shower result. An exception is given by the small |λ| region where some small (0.5%) discrepancies between the toy and full shower are due to the fact that, for some of the α s values used in the α s → 0 extrapolation, the value of |L| is not sufficiently large for non-logarithmic effects to be entirely negligible in this region (e.g. relative e L contributions).
We have carried out analogous tests for all of the observable/shower combinations analysed in this article. For the β obs = 0 cases, one can perform a direct comparison between the full shower and the toy shower using Eq. (30) and one obtains good agreement. For β obs > 0, there is no direct translation for running-coupling effects, however these cases all show agreement with NLL, both in the full and toy showers.

d. Fixed-order results
In all-order results, whether for the full shower or the toy shower, a key difficulty is to unambiguously sort through different logarithmic orders. The limit α s → 0 that we use is essential in this respect, but the accessible range of α s values remains limited, and a fixed-order expansion provides an alternative way of identifying parametrically dominant logarithmic terms that might come with a small coefficient.
The expansion of the cross section Σ(L) = n=0ᾱ n Σ n (L) for an observable V to be below some threshold e L can be written as follows where dΦ i is given by The insertion operator I(p real 1 , . . . , p real m ) in Eq. (31) inserts the emissions in order of decreasing ln v = lnp ⊥,i − β|η i | with the appropriate recoil prescription for the given shower, e.g. Eq. (23) for dipole showers.
A direct evaluation of Eq. (31) leads to terms with up to 2n logarithms for the coefficient ofᾱ n , from the exponentiation ofᾱL 2 structures. For observables that exponentiate, and at fixed coupling, these terms disappear in ln Σ(L), leaving at most termsᾱ n L m with m ≤ n. In a numerical (Monte Carlo) calculation of ln Σ, one could evaluate individual terms at different orders in theᾱ expansion of Σ(L) and then combine them to obtain the expansion of ln Σ(L). However this would lead to large cancellations betweenᾱ n L 2n terms coming from Monte Carlo calculations at distinct orders, with uncorrelated statistical errors.
Instead, we take the approach of directly evaluating the expansion of where Σ 1 (L) is the coefficient ofᾱ in theᾱ series expansion of Σ(L). A necessary (but not sufficient) condition for an NLL-correct shower, in the fixed-coupling approximation of our toy model, is that F should only have termsᾱ n L m with m ≤ n, like ln Σ(L). Note that with running coupling there would be termsᾱ n L n+1 , and it would make more sense to use an analogue of F in which the exponential pre-factor was adjusted for running coupling effects. The non-trivial result starts at second order, and writing F = nᾱ n F n , the first few terms are where we have introduced the shorthand In practice we evaluate the difference between a given shower and the CAESAR result δF n (L) ≡ F shower n (L) − F CAESAR n (L), so as to remove known NLL terms. In a NLL-correct shower, the δF should at most have contributions α n L m with m < n. It will be convenient to study δF n /L n , which should go to zero for NLL-accurate showers for large negative L. If it tends to a non-zero constant, that will signal NLL failure. A subset of results for dipole showers is shown in Fig. 4. The left-hand column of plots is for the thrust observable and shows δF 3 (L)/L 3 (top) and δF 4 (L)/L 4 (bottom) as a function of L. At order α 3 s one sees that the result tends to a constant, signifying an α 3 s L 3 term and NLL failure (as reported in the revised version of Ref. [37]). Rather surprising, however, is that δF 4 /L 4 (lower-left plot of Fig. 4) appears to have a linear behaviour at large negative L, signalling a term α 4 s L 5 . Such terms are super-leading, in the sense they are larger than any term that should be present in F (L) (or in ln Σ) for rIRC safe [47] observables in our fixed-coupling approximation. 2 We have found such terms to be present for dipole showers for all global observables with β obs > 0 and there is strong reason to believe that they are related to sub-leading terms α 2 s L being enhanced by powers of α s L 2 , giving α n s L 2n−3 . The toy shower cannot accurately predict the coefficients of all such terms in the full shower, because they are affected also by secondary radiation (while the toy model has only primary radiation). However the conclusion that there are such terms is, we believe, robust.
Let us now examine two non-global observables: the transverse momentum in a slice (S slice β=0,∆=1 ) and a hemisphere √ y 23 (angular-ordered Durham) jet resolution parameter (M R,β=0 ). The toy-shower F n (L)/L n results are shown in the middle and right-hand columns, respectively, of Fig. 4. The slice looks similar to the thrust case, with α 3 s L 3 and α 4 s L 5 terms. The hemisphere √ y 23 (M R,β=0 ) observable has α 3 s L 4 and α 4 s L 6 terms. The fact that this observable has one additional logarithm makes its analytical calculation somewhat easier (cf. section 3 below). Note that the toy shower is not, in general, suitable for evaluating single-logarithmic (α n s L n ) terms for non-global observables. However, once again, the existence of such terms in the toy model signals their existence also in the full shower.
It is natural to ask why we do not see the impact of super-leading logarithms in our all-order results. This will be easier to discuss below, once we have explained their origin in detail.
We close this section by illustrating the kind of result that one expects to obtain for a NLL-correct shower. This is shown in Fig. 5 for the PanLocal β = 1/2 shower, again at third and fourth order. In all cases, δF n /L n tends to zero for large negative L, as required for NLL correctness.

Super-leading logarithms
a. Hemisphere max p ⊥ The simplest observable with which to understand super-leading logarithms is M R,β=0 (or just M R,0 for brevity), the maximum p ⊥ of emissions in the right hemisphere, because the super-leading terms are visible already from order α 3 s . In the toy-model approach involving only soft primary emissions and fixed-coupling, the correct all-orders result for this observable is Σ = e −ᾱL 2 /2 , amounting to exponentiation of the leading-order result. The full QCD NLL result would additionally require inclusion of hard-collinear single-logarithmic corrections, a resummation of non-global logarithms, and running coupling effects. Such contributions are not included in our toy shower and so do not enter into our discussion below.
We are interested here in the fixed-order results for the difference, δF , between dipole showers and the correct NLL result (i.e. the CAESAR result from section 2). Recall that F was defined in Eq. (33) and that for n ≤ 3 δF n is equivalent to δ(ln Σ) n . A NLL discrepancy between the dipole shower and the correct result would reveal itself via single-logarithmic terms α n s L n in δF , while for a NLL correct shower δF should contain terms that are at most O α n s L n−1 . We start by examining order α 2 s , where M R,0 already reveals a NLL discrepancy, due to the recoil issue, i.e. one finds an α 2 s L 2 term in the difference between the dipole shower and the correct NLL result. To calculate the coefficient of this α 2 s L 2 term we use the method we introduced in Ref. [37] and examine dipole showers with evolution variable v = p ⊥ . We consider two emissions with values of the shower evolution variablep ⊥,1 andp ⊥,2 respectively, witĥ p ⊥,1 >p ⊥,2 = ζp ⊥,1 . The NLL order α 2 s discrepancy arises from a situation where the first emission hasη 1 > 0, i.e. is emitted in the right hemisphere H R , and has its transverse momentum modified by receiving recoil from the second emission also emitted in the right hemisphere. (The contribution from p 2 in the left-hand hemisphere vanishes after azimuthal integration.) The second emission then must have rapidityη 2 < 1 2 (η 1 − lnp ⊥1 ) (cf. Eq. (24)), and the transverse momentum for the first emission is modified so thatp ⊥,1 → p ⊥,1 =p ⊥,1 1 + ζ 2 − 2ζ cos φ. We then FIG. 6. Figure showing emission p1 in the left hemisphere and emissions p2 and p3 in the right hemisphere. On the left we show region a) where η3 > η2 while on the right we show region b) where η2 > η3 > 0. In region b) the condition for p3 to give recoil to p2 depends on p ⊥,1 and η1, which gives rise to a mismatch with the case where p1 is virtual, resulting in a super-leading logarithm as explained in the text.
obtain, for the difference between the dipole shower and the correct result, whereᾱ = 2C F α s /π and v is the maximum allowed value of M R,0 . There are also configurations wherep 1 is close to the hemisphere boundary, and gets pulled in/out by an emissionp 2 in the unmeasured hemisphere. This could contribute at α 2 s L 2 , but after azimuthal averaging yields zero. Defining f max = max ζ, 1 + ζ 2 − 2ζ cos φ and L = ln v and carrying out the required integrals gives the result Now we turn to order α 3 s . To obtain δ ln Σ 3 we need to consider p ⊥ ordered parton configurations with three real emissions as well as those with two real emissions and a virtual emission. At this order we havē where we note the presence of anᾱ 3 L 4 term which arises from the product of the double-logarithmic leading-order term,ᾱΣ 1 = − 1 2ᾱ L 2 + O (ᾱL), and the recoil discrepancyᾱ 2 δΣ 2 in Eq. (37). For the discrepancy between dipole showers and the NLL result to be at most single-logarithmic, we should see that thisᾱ 3 L 4 term cancels against a similar term inᾱ 3 δΣ 3 .
In order to obtain theᾱ 3 L 4 term fromᾱ 3 δΣ 3 we first consider the same two-parton configuration that gives thē α 2 L 2 term forᾱ 2 δΣ 2 , alongside an additional virtual emission. We label the virtual emission p 1 , the two real emissions p 2 and p 3 and consider the orderingp ⊥,1 p ⊥,2 >p ⊥,3 so that p 2 absorbs recoil from the emission of p 3 . Integrating over the full rapidity range ln 1/p ⊥,1 >η 1 > − ln 1/p ⊥,1 and overp ⊥,1 in the range 1 >p ⊥,1 >p ⊥,2 , produces a factor − ln 2p ⊥,2 , with the minus sign accounting for unitarity. It is simple to repeat the calculation that yields δΣ 2 , using emissions p 2 and p 3 and multiplied by this additional factor accounting for the virtual emission, to obtain Next we consider the situation where emission p 1 is real. To produce theᾱ 3 L 4 term we require that emission p 1 , which has the largest transverse momentum and is emitted first in the shower, should not affect the observable, which is the case when p 1 ∈ H L i.e.η 1 < 0. We also require that p 3 gives recoil to p 2 rather than to p 1 , since we are again examining situations that lead to a difference between dipole showers and the correct result.
There are then two distinct regions to be considered. In region a) we have the situation shown on the left in Fig. 6, whereη 3 >η 2 . In this situation p 3 can only give recoil to p 2 and one is free to integrate overη 1 andp ⊥,1 such that p 1 is in H L . The integral overη of all emissions and overp ⊥,1 gives On the other hand for region b) shown on the right of Fig. 6, the condition that p 2 receives recoil from p 3 is affected by the presence of p 1 . In this region the rapidity constraints are more involved and the integral overη 1 ,p ⊥,1 cannot be performed independently as was the case in region a) and also for the virtual correction. The integral overη of all emissions and overp ⊥,1 now gives Combining the contributions from Eqs. (40) and (41) we can perform the remaining integrals overp ⊥,2 , ζ and φ to obtain (42) Adding together real and virtual contributions, neglecting any O ᾱ 3 L 3 terms, we then have δΣ 3 = δΣ all real 3 + δΣ p1 virtual 3 and using Eq. (38) we are left with a surviving super-leading logarithmic contribution: in agreement with the fit in Fig. 4, to within errors. At next order in α s , one needs to consider up to two emissions in the unobserved hemisphere, each of which brings a factor α s L 2 , leading to the α 4 s L 6 term seen in Fig. 4.

b. Considerations at all orders
It is perhaps surprising that observables with super-leading logarithms in the toy shower should resum to give apparent good agreement with NLL in the full shower (cf. amber triangles in Fig. 2 right). Fig. 7 shows the toyshower all-order result for one of these observables, the thrust, so that we can compare fixed-order and all-order results within a single framework (using the same underlying shower insertion code, as described in section 2). The left-hand plot shows the usual Σ dipole /Σ NLL ratio as a function ofᾱL for severalᾱ values, together with its extrapolation tō α = 0. The extrapolation is consistent with unity, i.e. apparent all-order agreement with the NLL result, as in the full shower. Note that forᾱL = −1.5, theᾱ 3 L 3 andᾱ 4 L 5 terms from Fig. 4 would respectively contribute −0.054 and −0.87 to theᾱ = 0.01 (L = −150) line in Fig. 7 (left). The all-order results undoubtedly have the statistical power to resolve such effects, yet do not show any sign of them.
The right-hand plot of the same figure shows (Σ dipole /Σ NLL − 1)/ᾱ and its extrapolation toᾱ = 0. This serves as a verification that in this specific limit (i.e.ᾱL fixed andᾱ → 0, implyingᾱL 2 → ∞) any all-order discrepancies with respect to the NLL result mimic a standard NNLL, or even higher order, correction.
The presence of super-leading logarithms that evade detection at all orders is a particularly unpleasant characteristic of dipole showers, because it risks giving a false sense of security as to the validity of the underlying logarithmic structure. An analytic study of the all-order resummation of the super-leading logarithms is beyond the scope of this manuscript. However, a reader wishing to understand how an apparently large effect at fixed order seemingly vanishes at all orders, could consider the following argument. For all the amber triangles in Fig. 2, one contribution to the super-leading logarithms comes from anᾱ 2 L (NNLL) contribution promoted by additional factors ofᾱL 2 . Theᾱ 2 L term arises when a first emission a, contributing a factorᾱ, absorbs recoil from a second, unresolved, emission b with commensurate p ⊥ . Integrating over the rapidity of the second emission yields a factorᾱL, giving the overallᾱ 2 L. TheᾱL 2 enhancement factor that arises at next order comes about because there is a double logarithmic region for an emission c withp ⊥,c p ⊥,a that alters whether b can induce recoil for a (for example ifη c <η b , then one has a dipole chain (a − c − b) and a will not receive recoil from b). At all orders, the typical rapidity extent (|η a −η b |) in which one can have an a − b dipole without any other higher-p ⊥ particles in between can become of order either 1/ √ᾱ or 1, depending on the context. This causes the originalᾱ 2 L factor to have L replaced at all orders by 1/ √ᾱ or 1 respectively, givingᾱ 3/2 orᾱ 2 , i.e. even smaller than NNLL (which itself can arise from a multitude of sources). In this section we introduce the definition of the azimuthal separation between two Lund-plane declusterings [64] in e + e − → jets events. This observable has been used in the letter to test the azimuthal dependence of the effective double-soft strongly angular-ordered squared amplitude in different showers. A proper definition of the azimuthal angle ψ of a declustering p i → p j +p k requires the introduction of a dynamic reference axis such that ψ is always defined with respect to the direction of p i . We introduce the reference axes n x = (1, 0, 0), n y = (0, 1, 0), and n z = (0, 0, 1). We denote a generic rotation matrix R, which rotates the z axis to a direction d that has an inclination θ and an azimuth φ, We start with a configuration with two hard jets j 1 and j 2 in the event, and we set the initial reference direction as d ev ∝ j 1 − j 2 , with j 1 being the jet with the larger (positive) value of p z , and we normalise | d ev | = 1. We define the initial rotation matrix R 0 = R( d ev ), and we assign a sign s = 1 to the direction of the jet j 1 (with the larger z component), and a sign s = −1 to the direction of j 2 . Consider a declustering p i → p j + p k , primary or secondary, which we label as being step n in the declustering sequence constructed with the Cambridge algorithm [68] (four momenta are combined according to the standard E scheme, i.e. a 4-vector sum [66]). We introduce the reference directions of the declustering, We then determine an updated rotation matrix as where R n−1 is the rotation matrix relative to the previous declustering stage. We finally evaluate ψ for this declustering as ψ = atan2(δ u kj .R n . n y , δ u kj .R n . n x ) .
The Lund-plane declustering technology allows the inspection of the structure of the primary and secondary radiation in the event. We consider the two highest-p ⊥ primary Lund declusterings as an IRC safe proxy for the two leading (i.e. highest-p ⊥ ) primary emissions, which are the main element of NLL resummations for global observables. The azimuthal separation |∆ψ 12 | used in Fig. 1 is simply defined as the absolute value of the difference between the ψ angles of the two leading declusterings. Specifically, we show the |∆ψ 12 | distribution for a given bin of the leading declustering (k t1 ), with an additional constraint on the next highest p ⊥ primary declustering (k t2 ) of the form We consider the distribution normalised to the NLL vetoed cross section Σ(k t1 ) given in refs. [53]. The NLL prediction for this observable reads where λ = α s ln(Q/k t1 ), and β 0 = (11C A − 2n F )/(12π).

Subjet multiplicities
The scaling of particle multiplicities with centre-of-mass energy is one of the few observables that has long been included among the logarithmic structure tests routinely carried out for parton showers [35]. Relative to the infrared collinear-safe global (shape-like) observables and the non-global slice observable that we have concentrated on for NLL accuracy tests here, it has the interesting feature of being sensitive to the full nested soft-collinear multiple branching structure. Multiplicity itself is not infrared and collinear safe, however subjet multiplicities for specific jet algorithms are. We therefore concentrate on the latter. They are known to NLL accuracy [76] for the e + e − Durham (k t ) jet algorithm [75]. Whereas for global shape-like observables NLL implies control of terms α n s L n in ln Σ, NLL for multiplicities implies control of terms α n s L 2n−1 in the multiplicity itself. Accordingly we take the limit α s → 0 for fixed α s L 2 (rather than fixed α s L). Therefore, for an N p LL result, we should control terms suppressed by α p/2 s relative to the LL result. With L = 1 2 ln y cut , we consider N subjet shower (α s , α s L 2 )/N subjet NLL (α s , α s L 2 ) − 1 √ α s (50) and in particular take its α s → 0 limit while keeping α s L 2 fixed. Eq. (50) should vanish as √ α s if the shower is NLL accurate. Fig. 8 (left) illustrates the multiplicity as a function of 1 2 α s (Q) ln y cut for two values of α s , comparing the PanLocal β = 0.5 shower (in its dipole variant) to the NLL result. The right-hand plot shows Eq. (50) for the same shower, for three values of α s as well as the extrapolation to α s = 0. It illustrates the good agreement across the range of 1 2 α s (Q) ln y cut values (with the usual exception of the region close to 0, which is not sufficiently asymptotic). Other showers give similar results. Note that the α s → 0 extrapolation is more delicate for the multiplicity than for other observables, because of the effective expansion in powers of √ α s rather than α s . This implies a need for a broader range of α s values in order to obtain reliable results.

Considerations for αs → 0 limits of showers
To reach a conclusion about NLL correctness of showers, it has been crucial for us to be able to disentangle NLL terms from NNLL and yet higher-order contributions. This was achieved by considering the α s → 0 limit of Σ obs shower (α s , α s L) Σ obs NLL (α s , α s L) , for each given observable at fixed α s L. The requirement of small α s implies large L. Other than for the subjet multiplicity studies discussed in the previous section, the smallest α s values used in producing Fig. 2 were either 0.005 or 0.01, depending on the specific observable and shower.
Consider for now the smallest value, α s = 0.005. To achieve α s L = −0.5, one needs L = −100. In practice we typically add an event generation buffer of B = 18 units of the logarithm below the value of interest for the observable. The limit on the precision on the observable distribution is then expected to be proportional to e −B . The logarithm of the span of scales in the event generation, roughly −118, takes us into a regime that is far beyond that needed for parton showers in normal phenomenological contexts and introduces a number of challenges. In what follows, we outline some of those challenges and the main solutions we have adopted.

a. The challenges
Particle multiplicities. The logarithm of the particle multiplicity N in an event scales as √ α s L 2 . As we take α s → 0 with α s L fixed, α s L 2 = (α s L) 2 /α s grows large and particle multiplicities become huge. With α s = 0.005 and generation down to ln k t /Q = −118, the average multiplicity, N , is of the order of 10 5 . For computational strategies in event generation and subsequent analysis that scale as N this is at the edge of accessibility. Where they scale as N 2 (or worse) it can well be beyond it.
Numerical precision. Covering a range of momenta that stretches from 1 (in units of the centre-of-mass energy Q) down to e −118 6 · 10 −52 poses challenges. This is particularly the case for the calculation of the dot products that arise, for example, in Eqs. Size of cross sections. The cross section Σ(L) for (say) the √ y 23 observable to be below e L scales as Σ(L) exp[−2C F α s L 2 /π] at fixed coupling. Including full LL and NLL terms with running coupling, for α s = 0.005 and α s L = −0.5 one obtains Σ(L) ∼ 10 −29 . If one uses unweighted events, then a precision determination of Σ(L) requires that one needs to generate substantially more than 1/Σ(L) events. This is simply not feasible.

b. Solutions adopted
We have taken a hybrid approach to these challenges. Firstly, insofar as possible, we have attempted to limit the challenge. For example we use techniques similar to those of Ref. [63] supplemented with a Roulette Wheel [62] choice of dipole to eliminate the N 2 scaling for all showers except the PanGlobal one. (We believe N 2 scaling for the PanGlobal showers could perhaps also be eliminated with suitable caching, but so far this wasn't a critical necessity).
To address the precision issue, in the small-angle limit we use cross-products rather than dot products, which generically remain viable down to angles of rather than √ . A further improvement is to align the initial qq event along the z axis. Then there is almost no limit on the smallest angle that can be calculated with respect to the original qq direction, and when the shower produces an emitter at an angle θ with respect to the z axis, the smallest angle that can subsequently be calculated with respect to that emitter is θ. When these techniques reach their limit, we switch to the use of dd real and qd real types [61], though thanks to other advances below, this was necessary mainly for code testing and for the runs used to determine average multiplicities. One could also envisage techniques to handle arbitrarily small angles fully in double precision. A second element of our approach is to avoid generating emissions in regions that are of no relevance to a given study. We make an estimate for each emission of the most it could contribute to that observable (L approx = ln k t − β obs |η| with η =η + 1 β ln ρ for the PanScales showers, ignoring distinctions between primary and secondary emissions) and then avoid generating emissions whose contributions would be below e −B times the largest contribution so far in the shower. This dynamic cutoff reduces multiplicities to a manageable level and also keeps us in a regime where we can carry out calculations in double precision rather than dd real and qd real.
One potential concern about the dynamic cutoff is that while it is safe from the point of view of the global observables (because they are rIRC safe), it does cut out phase-space regions associated with the generation of superleading logarithms in dipole showers. However the fact that our full-shower results (with the dynamic cutoff) agree with the toy-shower results (which use the full primary phase space, i.e. no dynamic cutoff) gives us some reassurance that the dynamic cutoff is not generating misleading all-order conclusions in Fig. 2. Ultimately part of the reason that there is no issue here is that the super-leading logarithms are anyway strongly reduced by all-order resummation effects. This parametrically suppresses their effect in the α s → 0 limit and restricts the relevant phase-space region to a band of width 1/ √ α s below the observable boundary, a width that we account for by the generation buffer B.
The final element that we bring to the problem is weighted event generation. When β PS = β obs , this is fairly straightforward (to avoid ambiguity in this paragraph, we add a "PS" subscript to the β used in the parton shower ordering variable). We split event generation into bins of the first-emission ln v. For a given bin we include an event weight proportional to the Sudakov above the upper edge of the ln v bin. When β PS = β obs there is a strong correlation between the shower variable ln v for the first emission and the value of the observable and so the weight distribution for a given observable bin is relatively compact. This approach works less well when β PS = β obs and in some challenging cases (notably β PS = 1 2 , β obs = 0) further refinements are needed. There, we choose a bin for the highest L approx that we will accept, start the shower with the least negative ln v that can be in that bin (with an appropriate Sudakov weight) and then immediately reject an event if at any stage in its showering one produces an emission with an L approx that is larger than the upper edge of the bin. This rejection tends to happen in the early stages of the generation, so there is a limited cost to generating the (many) events that ultimately get rejected. It is conceivable that there might be more sophisticated approaches, but this one was sufficient in order to reach the few-per-mil precisions needed on the α s → 0 extrapolations at fixed α s L for Fig. 2, though it only just allowed us to explore the region down to α s L = −0.5 for β PS = 0.5 showers.
Note that extrapolations to α s = 0 are performed using a polynomial of degree n − 1 for n values of α s .