Asymmetric jet clustering in deep-inelastic scattering

We propose a new jet algorithm for deep-inelastic scattering (DIS) that accounts for the forward-backward asymmetry in the Breit frame. The Centauro algorithm is longitudinally invariant and can cluster jets with Born kinematics, which enables novel studies of transverse-momentum-dependent observables. Furthermore, we show that spherically-invariant algorithms in the Breit frame give access to low-energy jets from current fragmentation. We propose novel studies in unpolarized, polarized, and nuclear DIS at the future Electron-Ion Collider.

Introduction. Understanding the structure of nucleons and nuclei in terms of quarks and gluons remains an open goal. Jet production in deep-inelastic scattering (DIS) provides an excellent tool for this endeavor. The future Electron-Ion Collider (EIC) [1] will produce the first jets in polarized and nuclear DIS, which will enable a rich jet program.
The HERA jet measurements in DIS targeted gluon-initiated processes by requiring large transverse momentum in the Breit frame [44]. This suppresses the Born configuration, $\gamma^* q \to q$, which has recently been postulated as key to probing transverse-momentum-dependent (TMD) PDFs [11][12][13]. Complementary to semi-inclusive DIS observables, jets avoid nonperturbative TMD fragmentation functions. Moreover, modern jet substructure techniques [45] offer new methods for precise QCD calculations and for controlling nonperturbative effects; e.g., grooming or a recoil-free axis can be used to minimize hadronization effects or to study TMD evolution [46]. These techniques also provide new ways to connect to lattice QCD calculations, e.g., of the nonperturbative Collins-Soper kernel [47,48].
The Breit frame plays a central role in jet clustering for DIS [49], and it allows for a factorized TMD cross section in terms of the same soft and unsubtracted TMD functions as in Drell-Yan and $e^+e^- \to$ dihadron/dijet processes [11][12][13]. However, the longitudinally-invariant (LI) algorithms commonly used in DIS cannot cluster jets that enclose the beam axis given by the proton/photon direction (see Fig. 1).
In this letter, we introduce a new jet algorithm that is longitudinally invariant but can capture jets close to the Born configuration in the Breit frame. In addition, we use spherically-invariant (SI) algorithms to study the jet energy spectrum and find that they can separate the current and target fragmentation regions even for soft jets.
Finally, we suggest novel studies of jet energy and TMD observables.
Notation and DIS kinematics. In the Breit frame, the virtual photon momentum is given by:
$q^\mu = \frac{Q}{2}\,(\bar{n}^\mu - n^\mu) = Q\,(0, 0, 0, -1)$, (1)
where $n^\mu \equiv (1, 0, 0, +1)$ and $\bar{n}^\mu \equiv (1, 0, 0, -1)$. The proton momentum (up to mass corrections) is:
$P^\mu = \frac{Q}{2 x_B}\, n^\mu$, (2)
with Bjorken $x_B \equiv Q^2/(2\, q \cdot P)$. At Born level, the struck quark back-scatters against the proton and has momentum ($x = x_B$):
$p_q^\mu = x P^\mu + q^\mu = \frac{Q}{2}\, \bar{n}^\mu$. (3)
The fragmentation of the struck quark yields a jet that points along the beam direction. The algorithms we introduce below are designed to capture this jet. We define the scaling variable:
$z_{\rm jet} \equiv \frac{n \cdot p_{\rm jet}}{Q} = \frac{P \cdot p_{\rm jet}}{P \cdot q}$. (4)
At leading-logarithmic accuracy, $z_{\rm jet}$ is the fraction of the struck-quark momentum carried by the jet.
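As a numerical cross-check of the Breit-frame kinematics described above, the following minimal sketch builds the photon and proton momenta from the light-like vectors $n^\mu$ and $\bar{n}^\mu$ and evaluates the scaling variable for a Born-level struck quark. The helper names (`mdot`, `breit_frame`, `z_jet`) and the values of `Q` and `X_B` are illustrative assumptions, not part of the original analysis; the proton is taken to be massless.

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+,-,-,-); four-vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

# Light-like reference vectors as defined in the text.
N = (1.0, 0.0, 0.0, 1.0)      # n^mu, along the proton direction
NBAR = (1.0, 0.0, 0.0, -1.0)  # nbar^mu, along the struck-quark direction

def breit_frame(Q, x_B):
    """Return (q, P): photon and massless-proton momenta in the Breit frame."""
    q = tuple(0.5 * Q * (nb - n) for n, nb in zip(N, NBAR))
    P = tuple(Q / (2.0 * x_B) * n for n in N)
    return q, P

def z_jet(p_jet, q, P):
    """Scaling variable z_jet = (P . p_jet) / (P . q)."""
    return mdot(P, p_jet) / mdot(P, q)

# Illustrative values only (assumed, not taken from the simulation in the text).
Q, X_B = 10.0, 0.1
q, P = breit_frame(Q, X_B)

# Born-level struck quark: x P + q with x = x_B.
struck_quark = tuple(X_B * Pm + qm for Pm, qm in zip(P, q))

print("q^2        =", mdot(q, q))
print("x_B check  =", Q**2 / (2.0 * mdot(q, P)))
print("struck q   =", struck_quark)
print("z_jet(Born)=", z_jet(struck_quark, q, P))
```

With these conventions the struck quark carries $z_{\rm jet} = 1$ and points along $\bar{n}^\mu$, which is the configuration the algorithms introduced below are meant to capture.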
Figure 2. Jet clustering in the Breit frame using the longitudinally-invariant anti-$k_T$ (LI), Centauro, and spherically-invariant anti-$k_T$ (SI) algorithms in a DIS event simulated with Pythia 8. Each particle is illustrated as a disk with area proportional to its energy, and its position corresponds to the direction of its momentum projected onto the unfolded sphere about the hard-scattering vertex. The vertical dashed lines correspond to constant $\theta$ and the curved lines to constant $\phi$. All the particles clustered into a given jet are colored the same.
New jet algorithms for DIS. The longitudinally-invariant $k_T$-type jet algorithms [50][51][52][53][54] use the following distance measure:
$d_{ij} = \min(p_{Ti}^{2p}, p_{Tj}^{2p})\, \Delta R_{ij}^2 / R^2$, $d_{iB} = p_{Ti}^{2p}$, (5)
where $\Delta R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2$. Here $d_{ij}$ is the distance between two particles in the event and $d_{iB}$ is the beam distance. Since these algorithms cluster particles in the rapidity-azimuth ($y$-$\phi$) plane, they cannot form a jet enclosing the $\bar{n}^\mu$ direction ($y = -\infty$). One way to bypass this problem is to use spherically-invariant algorithms. Catani et al. first proposed to adapt spherically-invariant algorithms to DIS in ref. [55]. In this study we consider the $k_T$-type algorithms for $e^+e^-$ collisions [54,56], which have the following distance measure:
$d_{ij} = \min(E_i^{2p}, E_j^{2p})\, \frac{1 - c_{ij}}{1 - c_R}$, $d_{iB} = E_i^{2p}$, (6)
where $c_{ij} = \cos\theta_{ij}$ and $c_R = \cos R$. However, these algorithms lack the longitudinal invariance that connects the class of frames related to the Breit frame by $\hat{z}$ boosts, which is a crucial feature of jet clustering [49]. For example, it is important for multijet events where the parton kinematics is not constrained by $x_B$ and $Q^2$, and to identify photo-production or separate the beam remnant from forward jets [57].
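The difference between the two classes of distance measures can be made concrete with a small numerical sketch. It assumes the standard generalized-$k_T$ forms (pair distance built from $\Delta R^2$ in the rapidity-azimuth plane for the LI case, and from $1 - \cos\theta_{ij}$ for the SI case) and uses hypothetical helper names (`from_angles`, `d_ij_li`, `d_ij_si`). Two massless particles are placed symmetrically on opposite sides of the backward axis; for simplicity the C/A-like choice $p = 0$ is used, so that the beam distance is 1 in both cases.

```python
import math

def from_angles(E, theta, phi):
    """Massless four-vector (E, px, py, pz) from energy and polar/azimuthal angles."""
    return (E,
            E * math.sin(theta) * math.cos(phi),
            E * math.sin(theta) * math.sin(phi),
            E * math.cos(theta))

def d_ij_li(pa, pb, R=1.0, p=-1):
    """Longitudinally-invariant pair distance: min(pT^2p) * DeltaR^2 / R^2."""
    ya = 0.5 * math.log((pa[0] + pa[3]) / (pa[0] - pa[3]))
    yb = 0.5 * math.log((pb[0] + pb[3]) / (pb[0] - pb[3]))
    dphi = math.atan2(pa[2], pa[1]) - math.atan2(pb[2], pb[1])
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi  # wrap into [-pi, pi)
    pta2 = pa[1] ** 2 + pa[2] ** 2
    ptb2 = pb[1] ** 2 + pb[2] ** 2
    return min(pta2 ** p, ptb2 ** p) * ((ya - yb) ** 2 + dphi ** 2) / R ** 2

def d_ij_si(pa, pb, R=1.0, p=-1):
    """Spherically-invariant pair distance: min(E^2p) * (1 - c_ij) / (1 - cos R)."""
    dot3 = pa[1] * pb[1] + pa[2] * pb[2] + pa[3] * pb[3]
    norm = math.sqrt(sum(x * x for x in pa[1:]) * sum(x * x for x in pb[1:]))
    c_ij = dot3 / norm
    return min(pa[0] ** (2 * p), pb[0] ** (2 * p)) * (1.0 - c_ij) / (1.0 - math.cos(R))

# Two particles 0.1 rad away from the backward axis, on opposite sides in azimuth.
a = from_angles(1.0, math.pi - 0.1, 0.0)
b = from_angles(1.0, math.pi - 0.1, math.pi)

li = d_ij_li(a, b, R=1.0, p=0)  # compare with beam distance d_iB = 1
si = d_ij_si(a, b, R=1.0, p=0)  # compare with beam distance d_iB = 1
print("LI pair distance:", li)
print("SI pair distance:", si)
```

Although the opening angle between the two particles is only 0.2 rad, their azimuthal separation is $\pi$, so the LI pair distance exceeds the beam distance and the pair is not merged, whereas the SI distance is well below it.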
To solve this issue, we introduce a new jet algorithm that is longitudinally invariant along the Breit-frame beam axis yet captures the struck-quark jet. Recently, Boronat et al. [58] proposed a hybrid algorithm that suppresses $\gamma\gamma$ background at $e^+e^-$ colliders. In contrast, we suggest a jet algorithm that is asymmetric in the backward and forward directions, and suggest novel studies for spherically-invariant algorithms in DIS.
Starting with the distance measure of the Cambridge/Aachen (C/A) algorithm for $e^+e^-$ (i.e., Eq. (6) for $p = 0$), we write the numerator in Eq. (6) in terms of the unit vectors along the directions of particles $i$ and $j$,
$1 - c_{ij} = 1 - s_i s_j \cos\Delta\phi_{ij} - c_i c_j$, (7)
with $c_i = \cos\theta_i$ and $s_i = \sin\theta_i$. Expanding in the very backward limit (i.e., $\bar{\theta}_i \equiv \pi - \theta_i \ll 1$) we find:
$2\,(1 - c_{ij}) = \bar{\theta}_i^2 + \bar{\theta}_j^2 - 2\,\bar{\theta}_i \bar{\theta}_j \cos\Delta\phi_{ij} + O(\bar{\theta}^4)$. (8)
We then introduce the replacements:
$\bar{\theta}_i \to f(\bar{\eta}_i)$, (9)
where the function $f$ must satisfy $f(x) = x + O(x^2)$, and $p^\perp_i$ is the transverse momentum in the Breit frame. The term $\bar{\eta}_i$ (which in the Breit frame is $2 p^\perp_i/(n \cdot p_i)$) introduces an asymmetry: in the backward region the distance between particles is given by their separation in $\bar{\eta}$, which decreases as particles become closer in angle. In contrast, in the forward region $\bar{\eta}$ diverges and thus prevents jets from enclosing the proton beam direction, as in the anti-$k_T$ (LI) algorithm. We thus introduce the following distance measure:
$d_{ij} = \left[(\Delta f_{ij})^2 + 2 f_i f_j\,(1 - \cos\Delta\phi_{ij})\right]/R^2$, $d_{iB} = 1$, (10)
with $f_i \equiv f(\bar{\eta}_i)$ and $\Delta f_{ij} \equiv f_i - f_j$, which defines a new class of algorithms that we call Centauro algorithms. Two relevant choices [59] for the function $f$ are:
$f(x) = x$ and $f(x) = \sinh^{-1}(x)$. (11)
The Centauro algorithm is invariant under boosts along the $\hat{z}$ direction, but in the backward hemisphere it matches the spherically-invariant algorithms (see Eq. (6)). This feature is largely independent of the choice of $f$ [60].
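The asymmetry built into the Centauro measure can be checked numerically. The sketch below assumes the Breit-frame expression $\bar{\eta}_i = 2 p^\perp_i/(n \cdot p_i)$ with $n \cdot p = E - p_z$, and a pair distance of the form $[(f_i - f_j)^2 + 2 f_i f_j (1 - \cos\Delta\phi_{ij})]/R^2$ obtained from the backward-limit expansion described above. The function names are hypothetical, and the identity and `math.asinh` are used only as two examples of functions satisfying $f(x) = x + O(x^2)$.

```python
import math

def from_angles(E, theta, phi):
    """Massless four-vector (E, px, py, pz) in the Breit frame."""
    return (E,
            E * math.sin(theta) * math.cos(phi),
            E * math.sin(theta) * math.sin(phi),
            E * math.cos(theta))

def eta_bar(p):
    """Breit-frame eta_bar = 2 p_perp / (n . p), with n . p = E - pz."""
    n_dot_p = p[0] - p[3]
    if n_dot_p <= 0.0:
        return math.inf  # particle exactly along the proton direction
    return 2.0 * math.hypot(p[1], p[2]) / n_dot_p

def centauro_dij(pa, pb, R=1.0, f=lambda x: x):
    """Assumed Centauro pair distance; the beam distance is taken to be 1."""
    fa, fb = f(eta_bar(pa)), f(eta_bar(pb))
    dphi = math.atan2(pa[2], pa[1]) - math.atan2(pb[2], pb[1])
    return ((fa - fb) ** 2 + 2.0 * fa * fb * (1.0 - math.cos(dphi))) / R ** 2

def two_one_minus_cij(pa, pb):
    """2 (1 - cos theta_ij) for massless particles, i.e. the SI numerator times 2."""
    dot3 = pa[1] * pb[1] + pa[2] * pb[2] + pa[3] * pb[3]
    return 2.0 * (1.0 - dot3 / (pa[0] * pb[0]))

# Backward region: two particles close to the struck-quark direction.
a = from_angles(1.0, math.pi - 0.05, 0.0)
b = from_angles(2.0, math.pi - 0.08, 1.0)
exact = two_one_minus_cij(a, b)
centauro_id = centauro_dij(a, b, R=1.0)
centauro_asinh = centauro_dij(a, b, R=1.0, f=math.asinh)
print("2(1 - c_ij)         :", exact)
print("Centauro, f = x     :", centauro_id)
print("Centauro, f = asinh :", centauro_asinh)

# Forward region: eta_bar grows without bound toward the proton direction.
forward = from_angles(1.0, 0.01, 0.0)
print("eta_bar (forward)   :", eta_bar(forward))
```

In the backward region both choices of $f$ agree with the spherically-invariant numerator up to corrections of relative order $\bar{\theta}^2$, while a particle 0.01 rad from the proton direction already has $\bar{\eta}$ of order several hundred.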
Simulation results and applications. Throughout this letter we analyze DIS events with $Q > 10$ GeV simulated in Pythia 8 [61] with electron and proton beam energies of 10 and 100 GeV, respectively [62]. We exclude neutrinos and particles with $|\eta| > 4$ or $p_T < 200$ MeV in the laboratory frame. We use FastJet [54] to cluster jets in the laboratory frame with the anti-$k_T$ (LI), anti-$k_T$ (SI), and Centauro algorithms [63]; Fig. 2 illustrates the resulting jet clustering for an exemplary Pythia 8 event. The anti-$k_T$ (LI) algorithm clusters the particles from the fragmentation of the struck quark into four different jets [64]. In contrast, the anti-$k_T$ (SI) and Centauro algorithms cluster all of these particles into a single jet with $z_{\rm jet} \sim 1$. The Centauro algorithm mimics the features of the anti-$k_T$ (SI) algorithm in the backward direction and of the anti-$k_T$ (LI) algorithm in the forward direction. Furthermore, with Centauro and anti-$k_T$ (SI) jets it is also possible to suppress the target fragmentation with a cut on $z_{\rm jet} \sim 0.2 - 0.7$, as shown in Fig. 3 (center panel). This allows for direct studies of quark TMD observables. For the anti-$k_T$ (SI) [65] algorithm, a cut on $\eta_{\rm jet} < 1$ separates the current and target regions (right panel of Fig. 3). This reveals the full $z_{\rm jet}$ spectrum, which cannot be accessed with hadron measurements due to the contamination from the target fragmentation region [66,67]. For comparison we also show the result for the anti-$k_T$ (LI) algorithm in the left panel of Fig. 3.
Fig. 4 shows the $z_{\rm jet}$ and $\eta_{\rm jet}$ distributions of inclusive jets as described above. While in the backward region ($\eta_{\rm jet} < 0$) the Centauro and anti-$k_T$ (SI) algorithms result in a peak at large $z_{\rm jet} \sim 1$, the anti-$k_T$ (LI) algorithm separates that jet into several and yields a peak at small $z_{\rm jet}$. The two peaks at $z_{\rm jet} \sim 1$ and $z_{\rm jet} \sim 0$ correspond to backward and mid-rapidity jets, respectively. The intermediate $z_{\rm jet}$ region is described in terms of jet functions and DGLAP evolution [68][69][70][71].
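To illustrate how fragments on opposite sides of the backward axis end up in one jet with $z_{\rm jet} \sim 1$, the following toy sketch runs a naive inclusive sequential recombination in the Breit frame. It is not the FastJet implementation used for the results above: it assumes a Centauro-type pair distance with $f(x) = x$, a beam distance of 1, E-scheme recombination, and three hand-picked massless particles with $Q = 10$ GeV; all names and numbers are hypothetical.

```python
import math

def from_angles(E, theta, phi):
    """Massless four-vector (E, px, py, pz) in the Breit frame."""
    return (E,
            E * math.sin(theta) * math.cos(phi),
            E * math.sin(theta) * math.sin(phi),
            E * math.cos(theta))

def eta_bar(p):
    """Breit-frame eta_bar = 2 p_perp / (E - pz)."""
    return 2.0 * math.hypot(p[1], p[2]) / (p[0] - p[3])

def pair_distance(pa, pb, R):
    """Assumed Centauro-type distance with f(x) = x."""
    fa, fb = eta_bar(pa), eta_bar(pb)
    dphi = math.atan2(pa[2], pa[1]) - math.atan2(pb[2], pb[1])
    return ((fa - fb) ** 2 + 2.0 * fa * fb * (1.0 - math.cos(dphi))) / R ** 2

def cluster(particles, R=0.8, beam_distance=1.0):
    """Naive O(N^3) inclusive clustering: merge the closest pair until every
    remaining pair distance is at least the (constant) beam distance."""
    pseudojets = [tuple(p) for p in particles]
    while len(pseudojets) > 1:
        best = min(
            (pair_distance(pseudojets[i], pseudojets[j], R), i, j)
            for i in range(len(pseudojets))
            for j in range(i + 1, len(pseudojets))
        )
        if best[0] >= beam_distance:
            break  # all remaining pseudojets become final jets
        _, i, j = best
        merged = tuple(x + y for x, y in zip(pseudojets[i], pseudojets[j]))
        pseudojets = [p for k, p in enumerate(pseudojets) if k not in (i, j)]
        pseudojets.append(merged)
    return pseudojets

Q = 10.0  # GeV, illustrative
event = [
    from_angles(3.0, math.pi - 0.2, 0.0),      # struck-quark fragment
    from_angles(2.0, math.pi - 0.3, math.pi),  # fragment on the opposite side of the axis
    from_angles(5.0, math.pi / 2.0, 1.0),      # mid-rapidity particle
]
jets = cluster(event, R=0.8)
z_values = sorted((jet[0] - jet[3]) / Q for jet in jets)  # z_jet = n . p_jet / Q
print("number of jets:", len(jets))
print("z_jet values  :", z_values)
```

In this toy event the two backward fragments are merged into a single jet carrying almost all of the struck-quark light-cone momentum, while the mid-rapidity particle remains a separate jet with a much smaller $z_{\rm jet}$.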
The large-$z_{\rm jet}$ jets probe the threshold region [72], whereas the small-$z_{\rm jet}$ region is related to soft fragmentation in $e^+e^-$ collisions [73][74][75][76][77] and small-$x$ physics [78][79][80]. In Fig. 5, we also show the energy spectrum for jets that results from a perturbative calculation supplemented with a nonperturbative shape function. As detailed in the Appendix, the spectrum results from the evaluation of a factorization formula that is differential in $x_B$, $Q^2$, and $z_{\rm jet}$. The function $B_q$ is the quark beam function of refs. [81,82] and $D_q$ is the quark fragmentation function to a jet at the endpoint from ref. [72]. The resummation formula at the endpoint can be derived by combining the methods developed in refs. [70-72, 81, 83, 84], and is valid to next-to-leading logarithmic (NLL) accuracy, including the non-global effects of refs. [85,86]. We also matched to the full leading-order DGLAP evolution in the moderate-$z_{\rm jet}$ region. Exploiting the sum rule for the jets, which demands conservation of the final-state momentum, we can normalize to the leading-order DIS cross section. The PDFs were obtained from ref. [87]. The NLL uncertainty band is obtained from the envelope of varying each low scale of the renormalization group evolution by a factor of two, as well as all nonperturbative shape function scales and cutoffs for the Landau pole.
Figure 5. The NLL prediction for the Centauro algorithm, including threshold effects to NLL accuracy, full DGLAP evolution to LL, and a nonperturbative shape function.
We propose a measurement of $z_{\rm jet}$ at HERA, which has not been done before, and at the future EIC. The high-$z_{\rm jet}$ region corresponds to jets with high $p_T$ in the laboratory frame that can be measured with high precision and with an accuracy limited by the jet energy scale uncertainty, which reached 1% at HERA [44]. The measurement of the small-$z_{\rm jet}$ region will be challenging because these jets correspond to jet $p_T$ up to a few GeV in the laboratory frame [88], a region that can be limited by calorimeter noise and resolution. These issues could be bypassed by defining jets with charged particles only, which would require the inclusion of track-based jet functions on the theory side [89,90].
We also propose to use Centauro jets to study quark TMDs by measuring $q_T = p^\perp_{\rm jet}/z_{\rm jet}$. Fig. 6 shows that the $q_T$ spectrum for $z_{\rm jet} > 0.5$ peaks at $q_T < Q_{\rm min}/4$, which is ideal for TMD phenomenology. With the polarized proton beams available at the EIC, this observable would provide clean access to the Sivers PDFs. Fig. 6 also shows the $q_T$ distribution for anti-$k_T$ (SI) jets for $z_{\rm jet} > 0.5$ and $0 < z_{\rm jet} < 0.5$. Note that the latter is only possible because we can suppress the target fragmentation by requiring $\eta_{\rm jet} < 1$. While for $z_{\rm jet} > 0.5$ we find a result similar to that of the Centauro jets, for $z_{\rm jet} < 0.5$ the spectrum peaks at $q_T \sim Q$. Novel theoretical techniques are necessary to describe this $q_T$ region of mid-rapidity jets.
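For concreteness, the observable $q_T = p^\perp_{\rm jet}/z_{\rm jet}$ can be evaluated from a Breit-frame jet four-momentum as in the short sketch below. The function name `jet_qT` and the example jet are hypothetical, and $z_{\rm jet}$ is computed as $(E - p_z)/Q$, which assumes the Breit-frame conventions for $n^\mu$ given earlier.

```python
import math

def jet_qT(p_jet, Q):
    """Return (z_jet, q_T) for a Breit-frame jet four-vector (E, px, py, pz).

    z_jet = n . p_jet / Q with n = (1, 0, 0, 1), and q_T = p_perp_jet / z_jet.
    """
    z = (p_jet[0] - p_jet[3]) / Q
    p_perp = math.hypot(p_jet[1], p_jet[2])
    return z, p_perp / z

# Hypothetical backward jet for an event with Q = 10 GeV.
Q = 10.0
jet = (4.1, 0.6, 0.8, -3.9)
z, qT = jet_qT(jet, Q)
print("z_jet =", z)
print("q_T   =", qT, "GeV; Q/4 =", Q / 4.0, "GeV")
```

For this example jet, $z_{\rm jet}$ is above 0.5 and $q_T$ is below $Q/4$, i.e. in the region discussed above as suitable for TMD phenomenology.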
In addition, the longitudinal invariance and the ability to measure jets close to Born kinematics make the Centauro algorithm an attractive option to: i) extract the strong-coupling constant from the rates of n-jets [44]; ii) enable "tag-and-probe" studies of nuclei [8]; iii) identify the background for gluon helicity and Sivers PDF studies [9,27]; iv) study jet substructure and event shape observables in the DIS Breit frame; v) probe TMD evolution observables that can be related to lattice QCD. We leave those studies for future work.
Conclusions. We have proposed a new jet clustering approach tailored to the study of energetic jets with low transverse momentum in DIS. It relies on spherically-invariant algorithms and on a new longitudinally-invariant algorithm that is asymmetric in the backward and forward directions, which we call Centauro. The Centauro algorithm enables novel studies of transverse-momentum-dependent observables in the Breit frame. Furthermore, we find that spherically-invariant $k_T$-type algorithms yield clean access to the soft jet fragmentation region, which also reveals a new $q_T$ regime where $q_T \sim Q$. The new jet algorithms introduced here are relevant for studies of jet energy spectra, jet substructure, quark TMDs and spin physics, and cold-nuclear-matter effects.
All these studies will be central for the jet physics program of the future Electron-Ion Collider.
[...] channel). We thus estimate a JES uncertainty of 2%, which we consider conservative. The resulting correlated uncertainty on the jet-energy spectrum is shown as a band in Figure 7.
The bin widths presented in Figure 7 are chosen to obtain a controllable unfolding procedure, as informed by our simulation studies. To obtain a finer binning, an improved calorimeter resolution over current specifications would be required. Alternatively, the jet energy could be defined with charged particles only, for which the jet-energy resolution would be better than 1%. Other techniques, such as a neutral-hadron veto, could also help to improve the energy resolution [9]. We leave dedicated studies to explore these possibilities for future work.