Measurement of long-range multiparticle azimuthal correlations with the subevent cumulant method in pp and p + Pb collisions with the ATLAS detector at the CERN Large Hadron Collider

A detailed study of multiparticle azimuthal correlations is presented using pp data at √ s = 5 . 02 and 13 TeV, and p + Pb data at √ s NN = 5 . 02 TeV, recorded with the ATLAS detector at the CERN Large Hadron Collider. The azimuthal correlations are probed using four-particle cumulants c n { 4 } and ﬂow coefﬁcients v n { 4 } = ( − c n { 4 } ) 1 / 4 for n = 2 and 3, with the goal of extracting long-range multiparticle azimuthal correlation signals and suppressing the short-range correlations. The values of c n { 4 } are obtained as a function of the average number of charged particles per event, (cid:4) N ch (cid:5) , using the recently proposed two-subevent and three-subevent cumulant methods, and compared with results obtained with the standard cumulant method. The standard method is found to be strongly biasedbyshort-rangecorrelations,whichoriginatemostlyfromjetswithapositivecontributionto c n { 4 } .Thethree-subevent method, on the other hand, is found to be least sensitive to short-range correlations. The three-subevent method gives a negative c 2 { 4 } , and therefore a well-deﬁned v 2 { 4 } , nearly independent of (cid:4) N ch (cid:5) , which implies that the long-range multiparticle azimuthal correlations persist to events with low multiplicity. Furthermore, v 2 { 4 } is found to be smaller than the v 2 { 2 } measured using the two-particle correlation method, as expected for long-range collective behavior. Finally, the measured values of v 2 { 4 } and v 2 { 2 } are used to estimate the number of sources relevant for the initial eccentricity in the collision geometry. The results based on the subevent cumulant technique provide direct evidence, in small collision systems, for a long-range collectivity involving many particles distributed across a broad rapidity interval.


I. INTRODUCTION
The study of azimuthal correlations in high-energy nuclear collisions at the Relativistic Heavy Ion Collider (RHIC) and the Large Hadron Collider (LHC) has been important for understanding the multiparton dynamics of QCD in the strongly coupled nonperturbative regime.One striking observation is the long-range ridge [1][2][3][4][5] in two-particle angular correlations (2PC): an apparent collimated emission of particle pairs with small relative azimuthal angle ( φ) and large separation in pseudorapidity ( η).The ridge signature from 2PC is characterized by a Fourier decomposition of the correlation function C( φ) ∼ 1 + 2 n v 2  n cos(n φ), where v n denotes the single-particle anisotropy harmonic coefficients.The secondorder coefficient v 2 is observed to be the largest, followed by v 3 [3,4].These coefficients carry information about the collective behavior of the produced system.The ridge was first discovered in nucleus-nucleus (A+A) collisions [1][2][3][4][5][6], but was later observed in small systems such as proton-nucleus (p+A) collisions [7][8][9][10][11], light-ion-nucleus collisions [12], and more recently in proton-proton (pp) collisions [13][14][15][16].The ridge in large systems, such as central or midcentral A+A collisions, is commonly interpreted as the result of collective hydrodynamic expansion of hot and dense nuclear matter created in the overlap region of the colliding nuclei.Since the formation of an extended region of nuclear matter is not expected in small collision systems such as p+A and pp, the origin of the ridge there could be different from that formed in large collision systems.There remains considerable debate in the theoretical community as to whether the ridge in small systems is of hydrodynamic origin, like it is in A+A collisions [17], or stems from other effects such as initial-state gluon saturation [18].
An important question about the ridge is whether it involves all particles in the event (collective flow) or if it arises merely from correlations among a few particles, due to resonance decays, jets, or multijet production (nonflow).In small systems the contributions from nonflow sources, in particular from jets and dijets, are large.The extraction of a ridge signal using the 2PC method requires a large η gap and careful removal of the significant contribution from dijet production [8][9][10]14,15,19].Since collective flow is intrinsically a multiparticle phenomenon, it can be probed more directly using cumulants based on multiparticle correlation techniques [20].Azimuthal correlations involving four, six, and eight particles have been measured in p+Pb, d+Au, and pp collisions, and a significant v 2 signal has been obtained [11,19,21,22].One weakness of the standard multiparticle cumulant method is that it does not suppress adequately the nonflow correlations in small systems, which lead to a sign change of c 2 {4} at smaller values of the charged particle multiplicity, N ch [11,16,19,21].Furthermore, the magnitude of c 2 {4} and the N ch value at which the sign change occurs are found to depend sensitively on the exact definition of N ch used to categorize the events.These observations suggest that the standard cumulant method, on which several previous measurements in small systems are based, is strongly contaminated by nonflow correlations [11,19,21,22], especially in pp collisions and low N ch region.
Recently an improved cumulant method based on the correlation between particles from different subevents separated in η has been proposed to further reduce the nonflow correlations [23].The effectiveness of this method for suppressing nonflow correlations has been validated using the PYTHIA8 event generator [24], which contains only nonflow correlations.
This paper presents measurements of c 2 {4} and c 3 {4} in pp collisions at √ s = 5.02 and 13 TeV, as well as p+Pb collisions at √ s NN = 5.02 TeV.They are obtained using twoand three-subevent cumulant methods and are compared with the standard cumulant method.The c 2 {4} cumulant is converted to the corresponding v 2 coefficient and compared with the results obtained using the two-particle correlation method in Refs.[10,15] to assess the nature of the event-by-event fluctuation of the collective flow in these collisions.
The paper is organized as follows.Section II describes the framework for the standard, two-subevent and three-subevent four-particle cumulant methods used in this analysis.Details of the detector, trigger, data sets, as well as event and track selections are provided in Secs.III-V.The correlation analysis and systematic uncertainties are described in Secs.VI and VII, respectively.The measured cumulants from the three data sets are provided in Sec.VIII.A summary is given in Sec.IX.

II. FOUR-PARTICLE CUMULANTS
The multiparticle cumulant method [20] is used to extract the amplitude of long-range azimuthal correlations of particles produced in high-energy collisions.This method has the advantage of suppressing correlations from jets and dijets, instead of relying on an explicit procedure to correct v n harmonics for dijet contributions in the 2PC approach, as done in Refs.[10,14].The framework for the standard cumulant is described in Refs.[25,26], which was recently extended to the case of subevent cumulants in Ref. [23].This paper presents measurements of four-particle cumulants obtained with the standard, two-subevent, and three-subevent methods.The following discussion first describes the standard cumulant method, then describes the two-and three-subevent methods focusing on the differences from the standard method.
The cumulant methods involve the calculation of 2k-particle azimuthal correlations {2k} n , and 2k-particle cumulants, c n {2k}, for the nth-order flow harmonics.The two-or fourparticle azimuthal correlations in one event are evaluated as [23,25,26]: where " " denotes a single-event average over all pairs or quadruplets, respectively.The averages from Eqs. ( 1) and ( 2) are expanded into per-particle normalized flow vectors q n;l and factors τ l with l = 1,2, . . .: where the sum runs over all M particles in the event and w j is a weight assigned to the j th particle.This weight is constructed to correct for both detector nonuniformity and tracking inefficiency as explained in Sec.VI.For unit weight w j = 1, then q mn;m = q mn , and τ l = 1/M l .The two-and four-particle cumulants are obtained from the azimuthal correlations as: where " " represents a weighted average of {2k} n over an event ensemble.In the absence of nonflow correlations, c n {2k} reflects the moments of the distribution of the flow coefficient v n : If harmonic coefficients do not fluctuate event by event, Eq. ( 6) n , and c n {4} flow is expected to be negative.Therefore, the flow coefficients from two-and four-particle cumulants are defined as: In the standard cumulant method described so far, all 2k-particle multiplets involved in {2k} n are selected using the entire detector acceptance.To further suppress the nonflow correlations that typically involve particles emitted within a localized region in η, the particles can be grouped into several subevents, each covering a nonoverlapping η interval [23].The multiparticle correlations are then constructed by correlating particles between different subevents, further reducing nonflow correlations.This analysis uses the subevent cumulant methods based on two and three subevents as described in the following.
In the two-subevent cumulant method, the entire event is divided into two subevents, labeled as a and b, for example, ac-024904-2 cording to −η max < η a < 0 and 0 < η b < η max , where η max = 2.5 is the maximum η used in the analysis and corresponds to the ATLAS detector acceptance for charged particles.The per-event two-and four-particle azimuthal correlations are then evaluated as: where the superscript or subscript a (b) indicates particles chosen from the subevent a (b).Here the four-particle cumulant is defined as: The two-subevent method should suppress correlations within a single jet (intrajet correlations), since each jet usually emits particles into only one subevent.
In the three-subevent cumulant method, the event is divided into three subevents a, b, and c each covering a unique η range, for example −η max < η a < −η max /3, |η b | < η max /3, and η max /3 < η c < η max .The four-particle azimuthal correlations and cumulants are then evaluated as: where {2} n a|b and {2} n a|c are two-particle correlators defined as in Eq. (8).Since the two jets in a dijet event usually produce particles in at most two subevents, the three-subevent method further suppresses nonflow contributions from interjet correlations associated with dijets.To enhance the statistical precision, the η range for subevent a is also interchanged with that for subevent b or c, and the resulting three c 2a|b,c n {4} values are averaged to obtain the final result.

III. DETECTOR AND TRIGGER
The ATLAS detector [27] provides nearly full solid-angle coverage around the collision point with tracking detectors, calorimeters, and muon chambers, and is well suited for measurement of multiparticle correlations over a large pseudorapidity range. 1 The measurements were performed primarily 1 ATLAS typically uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the z axis along the beam pipe.The x axis points from the IP to the center of the LHC ring, and the y axis points upward.Cylindrical coordinates (r,φ) are used in the transverse plane, φ being the azimuthal angle around the beam pipe.By default, the pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2).However, for asymmetric p+Pb or Pb+p collisions, the −z direction is always defined as the direction of the Pb beam.
The ATLAS trigger system [30] consists of a Level-1 (L1) trigger implemented using a combination of dedicated electronics and programmable logic, and a high-level trigger (HLT) implemented in processors.The HLT reconstructs charged-particle tracks using methods similar to those applied in the offline analysis, allowing high-multiplicity track (HMT) triggers that select events based on the number of tracks with p T > 0.4 GeV associated with the vertex with the largest number of tracks.The different HMT triggers also apply additional requirements on either the transverse energy (E T ) in the calorimeters or on the number of hits in the MBTS at L1, and on the number of charged-particle tracks reconstructed by the HLT.The pp and p+Pb data were collected using a combination of the minimum-bias and HMT triggers.More details of the triggers used for the pp and p+Pb data can be found in Refs.[15,31] and Refs.[10,32], respectively.

IV. DATA SETS AND MONTE CARLO SIMULATIONS
This analysis uses integrated luminosities of 28 nb −1 of p+Pb data recorded at √ s NN = 5.02 TeV, 0.17 pb −1 of pp data recorded at √ s = 5.02 TeV, and 0.9 pb −1 of pp data recorded at √ s = 13 TeV, all taken by the ATLAS experiment at the LHC.The p+Pb data were mainly collected in 2013, but also include 0.3 nb −1 data collected in November 2016, which increases the number of events at moderate multiplicity (see Sec. V).During both p+Pb runs, the LHC was configured with a 4 TeV proton beam and a 1.57TeV per-nucleon Pb beam that together produced collisions at √ s NN = 5.02 TeV, with a rapidity shift of 0.465 of the nucleon-nucleon center-ofmass frame towards the proton beam direction relative to the ATLAS rest frame.The direction of the Pb beam is always defined to have negative pseudorapidity.The 5.02 TeV pp data were collected in November 2015.The 13 TeV pp data were collected during several special low-luminosity runs of the LHC in 2015 and 2016.
Monte Carlo (MC) simulated event samples are used to determine the track reconstruction efficiency (Sec.V).The 13 TeV and 5.02 TeV pp data were simulated by the PYTHIA8 MC event generator [24] using the A2 set of tuned parameters with MSTW2008LO parton distribution functions [33].The HIJING event generator [34] was used to produce p+Pb collisions with the same energy and the same boost of the center-of-mass system as in the data.The detector response was simulated using GEANT4 [35,36] with detector conditions matching those during the data taking.The simulated events 024904-3 and data events are reconstructed with the same algorithms, including those for track reconstruction.

V. EVENT AND TRACK SELECTION
The offline event selection for the p+Pb and pp data requires at least one reconstructed vertex with its longitudinal position satisfying |z vtx | < 100 mm.The vertex is required to have at least two associated tracks with p T > 0.4 GeV.The mean collision rate per bunch crossing μ was approximately 0.03 for the 2013 p+Pb data, 0.001-0.006for the 2016 p+Pb data, 0.02-1.5 for 5.02 TeV pp data, and 0.002-0.8for the 13 TeV pp data.In order to suppress additional interactions in the same bunch crossing (referred to as pileup) in pp collisions, events containing additional vertices with at least four associated tracks are rejected.In p+Pb collisions, events with more than one good vertex, defined as any vertex for which the scalar sum of the p T of the associated tracks is greater than 5 GeV, are rejected.The remaining pileup events are further suppressed by using the signal in the ZDC on the Pb-fragmentation side.This signal is calibrated to the number of detected neutrons (N n ) by using the location of the peak corresponding to a single neutron.The distribution of N n in events with pileup is broader than that for the events without pileup.Hence a simple requirement on the ZDC signal distribution is used to further suppress events with pileup, while retaining more than 98% of the events without pileup.The impact of residual pileup, at a level of 10 −3 , is studied by comparing the results obtained from data with different μ values.
Charged-particle tracks and collision vertices are reconstructed using the same algorithms and methods applied in previous minimum-bias pp and p+Pb measurements [10,14,31].For the 2013 p+Pb analysis, tracks are required to have a p T -dependent minimum number of hits in the SCT.The transverse (d 0 ) and longitudinal (z 0 sin θ ) impact parameters of the track relative to the primary vertex are both required to be less than 1.5 mm.A more detailed description of the track selection for the 2013 p+Pb data can be found in Ref. [10].
For all the data taken since the start of Run 2, the track selection criteria make use of the IBL, as described in Refs.[14,31].Furthermore, the requirements of |d BL 0 | < 1.5 mm and |z 0 sin θ | < 1.5 mm are applied, where d BL 0 is the transverse impact parameter of the track relative to the beam line (BL).
The cumulants are calculated using tracks passing the above selection requirements, and having |η| < 2.5 and 0.3 < p T < 3 GeV or 0.5 < p T < 5 GeV.These two p T ranges are chosen because they were often used in previous ridge measurements at the LHC [11,[14][15][16]19].However, to count the number of reconstructed charged particles for event-class definition (denoted by N rec ch ), tracks with p T > 0.4 GeV and |η| < 2.5 are used for compatibility with the requirements in the HLT selections described above.Due to different trigger requirements, most of the p+Pb events with N rec ch > 150 are provided by the 2013 data set, while the 2016 data set provides most of the events at lower N rec ch .The efficiency of the combined track reconstruction and selection requirements in data is estimated using the MC samples reconstructed with the same tracking algorithms and the same track selection requirements.Efficiencies, (η,p T ), are evaluated as a function of track η, p T and the number of reconstructed charged-particle tracks, but averaged over the full range in azimuth.For all collision systems, the efficiency increases by about 4% as p T increases from 0.3 GeV to 0.6 GeV.Above 0.6 GeV, the efficiency is independent of p T and reaches 86% (72%) at η ≈ 0 (|η| > 2) for pp collisions and 83% (70%) for p+Pb collisions, respectively.The efficiency is independent of the event multiplicity for N rec ch > 40.For lower-multiplicity events the efficiency is smaller by up to a few percent due to broader d BL 0 and z 0 sin θ distributions.The rate of falsely reconstructed charged-particle tracks is also estimated and found to be negligibly small in all data sets.This rate decreases with increasing p T , and even at the lowest transverse momenta of 0.2 GeV it is below 1% of the total number of tracks.Therefore, there is no correction for the presence of these tracks in the analysis.
In the simulated events, the reconstruction efficiency reduces the measured charged-particle multiplicity relative to the generated multiplicity for primary charged particles.The multiplicity correction factor b is used to correct N rec ch to obtain the efficiency-corrected number of charged particles per event, The value of the correction factor is found to be independent of N rec ch in the range used in this analysis.Its value and the associated uncertainties are b = 1.29 ± 0.05 for the 2013 p+Pb collisions and b = 1.18 ± 0.05 for Run-2 p+Pb and pp collisions [37].Both c n {4} and v n {4} are then studied as a function of N ch .
In the second step, the correlators {2k} n are averaged over events with the same N sel ch , the number of reconstructed charged particles in a given p T range, to obtain {2k} n and c n {2k} from Eqs. ( 4), (10), and (12).In a previous study [16], it was observed that the c n {2k} values varied with the exact definition of N sel ch .This is because different definitions of N sel ch lead to different multiplicity fluctuations and therefore different nonflow correlations associated with these multiplicity fluctuations.The observed dependence of c n {2k} on the definition of N sel ch has been attributed to the change in the nonflow correlations when N sel ch is changed [16].
In order to further test the sensitivity of c n {2k} to the exact definition of N sel ch , four different p T requirements are used to define N sel ch as follows: when {2k} n is calculated in the range 0.3 < p T < 3 GeV, N sel ch is evaluated in four different track p T ranges: 0.3 < p T < 3 GeV, p T > 0.2 GeV, p T > 0.4 GeV, and p T > 0.6 GeV.When {2k} n is calculated in 0.5 < p T < 5 GeV, N sel ch is evaluated in four different track p T ranges: 0.5 < p T < 5 GeV, p T > 0.2 GeV, p T > 0.4 GeV, and p T > 0.6 GeV.In each case, the c n {2k} value is first calculated for events with the same N sel ch ; the c n {2k} values are 024904-4 then combined in the broader N sel ch range of the event ensemble to obtain statistically significant results.
In the third step, the c n {2k} and v n {2k} values obtained for a given N sel ch are mapped to a given N rec ch , the average number of reconstructed charged particles with p T > 0.4 GeV.The mapping procedure is necessary so that c n {2k} obtained for different N sel ch can be compared using a common x axis defined by N rec ch .The N rec ch value is then converted to N ch , the efficiency-corrected average number of charged particles with p T > 0.4 GeV, as discussed in Sec.V.
In order to account for detector inefficiencies and nonuniformity, particle weights used in Eq. ( 3) are defined as: The additional weight factor d(φ,η) accounts for nonuniformities in the azimuthal acceptance of the detector as a function of η.All reconstructed charged particles with p T > 0.2 GeV are entered into a two-dimensional histogram N(φ,η), and the weight factor is then obtained as d(φ,η) ≡ N (η) /N (φ,η), where N(η) is the track density averaged over φ in the given η bin.This procedure removes most φ-dependent nonuniformity from track reconstruction for any azimuthal correlation analysis [16].

VII. SYSTEMATIC UNCERTAINTIES
The main sources of systematic uncertainty are related to the detector azimuthal nonuniformity, track selection, track reconstruction efficiency, trigger efficiency, and pileup.Most of the systematic uncertainties enter the analysis through the particle weights, Eq. ( 13).Since c 2 {4} often changes sign in the low N ch region, the absolute uncertainties (instead of relative uncertainties) in c 2 {4} are determined for each source.The uncertainties are typically of the order of 10 −6 , which translates into an absolute uncertainty of 4 √ 10 −6 = 0.032 for zero flow signal.
The effect of detector azimuthal nonuniformity is accounted for using the weight factor d(φ,η).The impact of the reweighting procedure is studied by fixing the weight to unity and repeating the analysis.The results are mostly consistent with the nominal results within statistical uncertainties.As a cross check, the multiparticle correlations are calculated using a mixed-event procedure, where each particle in a 2k multiplet is selected from a different event with similar N rec ch (| N rec ch | < 10) and similar z vtx (| z vtx | < 10 mm).The particle weights defined in Eq. ( 13) are applied for each particle forming the mixed event.The c 2 {4} signal obtained from the mixed events is less than 0.2 × 10 −6 in all data sets.
The systematic uncertainty associated with the track selection is estimated by tightening the |d 0 | and |z 0 sin θ | requirements.For each variation, the tracking efficiency is reevaluated and the analysis is repeated.The maximum differences from the nominal results are observed to be less than 0.3 × 10 −6 , 0.2 × 10 −6 , and 0.1 × 10 −6 in 5.02 TeV pp, 13 TeV pp, and p+Pb collisions, respectively.
Previous measurements indicate that the azimuthal correlations (both the flow and nonflow components) have a strong dependence on p T , but a relatively weak dependence on η [10,15].Therefore, p T -dependent systematic effects in the track reconstruction efficiency could affect c n {2k} and v n {2k} values.The uncertainty in the track reconstruction efficiency is mainly due to differences in the detector conditions and material description between the simulation and the data.The efficiency uncertainty varies between 1% and 4%, depending on track η and p T [15,16].Its impact on multiparticle cumulants is evaluated by repeating the analysis with the tracking efficiency varied up and down by its corresponding uncertainty as a function of p T .For the standard cumulant method, which is more sensitive to jets and dijets, the evaluated uncertainty amounts to (0.1-1.5)×10 −6 in pp collisions and less than 0.3 × 10 −6 in p+Pb collisions for N ch > 50.For the two-and three-subevent methods, the evaluated uncertainty is typically less than 0.3 × 10 −6 for most of the N ch ranges.
Most events used in the analysis are collected with the HMT triggers with several N rec ch thresholds.In order to estimate the possible bias due to trigger inefficiency as a function of N ch , the offline N rec ch requirements are changed such that the HMT trigger efficiency is at least 50% or 80%.The results are obtained independently for each variation.These results are found to be consistent with each other for the two-and three-subevent methods, and show a small difference for the standard cumulant method in the low N ch region.The nominal analysis is performed using the 50% efficiency selection and the differences between the nominal results and those from the 80% efficiency selection are used as a systematic uncertainty.The change amounts to (0.1-0.7)×10 −6 .
In this analysis, a pileup rejection criterion is applied to reject events containing additional vertices.In order to check the impact of residual pileup, the analysis is repeated without the pileup rejection criterion, and no difference is observed.For the 5.02 and 13 TeV pp data sets, which have relatively high pileup, the data is divided into two samples based on the μ value: μ > 0.4 and μ < 0.4, and the results are compared.The average μ values differ by a factor of two between the two samples, and the difference in c 2 {4} is found to be less than 0.5 × 10 −6 .
To check the impact of dijet events, where both jets have pseudorapidities close to the boundaries of relevant subevent regions, the three-subevent cumulants are calculated by requiring a η = 0.5 gap between the adjacent regions.The results are found to be consistent with the nominal result.
The systematic uncertainties from different sources are added in quadrature to determine the total systematic uncertainty.The uncertainty is (0.1-1)×10 −6 for two-and threesubevent methods in the region N ch > 50, where there is a negative c 2 {4} signal.The total systematic uncertainty for the standard method is typically about a factor of two larger.
The systematic uncertainty studies described above are also carried out for c 3 {4}, and the absolute uncertainties are found to be smaller than those for c 2 {4}, presumably because c 3 {4} is less sensitive to the influence from dijets.

A. Dependence on the event-class definition
This section presents the sensitivity of c 2 {4} to N sel ch , which defines the event class used to calculate {2} n and {4} n in Eqs. ( 10)-( 12).The discussion is based on results obtained from the 13 TeV pp data, but the observations for the 5.02 TeV pp and p+Pb data are qualitatively similar.
Figure 1 shows the c 2 {4} values obtained using the standard method for four event-class definitions based on N sel ch .The c 2 {4} values changes dramatically as the event-class definition is varied, which, as points out in Ref. [23], reflects different amount of nonflow fluctuations associated with different N sel ch .The c 2 {4} values for 0.3 < p T < 3 GeV become negative when the reference N sel ch is obtained for p T > 0.4 GeV or higher, but the four cases do not converge to the same c 2 {4} values.On the other hand, c 2 {4} values for 0.5 < p T < 5 GeV are always positive, independent of the definition of N sel ch .These behaviors suggest that the c 2 {4} values from the standard method are strongly influenced by nonflow effects in all N ch and p T ranges.Therefore the previously observed negative c 2 {4} in pp collisions for 0.3 < p T < 3 GeV and N sel ch with p T > 0.4 GeV [19] may be dominated by nonflow correlations instead of long-range collective flow.
Figure 2 shows that the c 2 {4} values calculated using the two-subevent method are closer to each other among different event-class definitions.The c 2 {4} values decrease gradually with N ch and become negative for N ch > 70 when c 2 {4} is calculated in the range 0.3 < p T < 3 GeV range and for N ch > 150 when c 2 {4} is calculated in the range 0.5 < p T < 5 GeV.Therefore, the c 2 {4} values from the two-subevent method are more sensitive to long-range ridge correlations, but nevertheless may still be affected by nonflow effects, especially in the low N ch region and higher p T .
Figure 3 shows the results from the three-subevent method.For most of the N ch range, the c 2 {4} values are negative, i.e., having the sign expected for long-range ridge correlations.The c 2 {4} values show some sensitivity to the definition of the reference N sel ch but they are close to each other for all definitions in the region N ch > 100.This suggests that the residual nonflow effects may still be important at small N ch , but are negligible at N ch > 100.It is also observed that the c 2 {4} values for 0.5 < p T < 5 GeV are more negative than those for 0.3 < p T < 3 GeV, which is consistent with the observation that the v 2 value associated with the long-range collectivity increases with p T [10,15].
Given the relatively small dependence of c 2 {4} on the reference N sel ch in the three-subevent method, the remaining discussion focuses on cases where the reference N sel ch is calculated in the same p T ranges as those used for calculating c 2 {4}, i.e., 0.3 < p T < 3 GeV and 0.5 < p T < 5 GeV.show direct comparisons of the results for the standard, two-subevent, and three-subevent methods for pp collisions at √ s = 13 TeV, pp at √ s = 5.02 TeV, and p+Pb collisions at √ s NN = 5.02 TeV, respectively.The results from 5.02 TeV pp collisions are qualitatively similar to those from the 13 TeV pp collisions, i.e., the c 2 {4} values are smallest for the three-subevent method and largest for the standard method.The same hierarchy between the three methods is also observed in p+Pb collisions, but only for the N ch < 100 region, suggesting that nonflow effects in p+Pb collisions are much smaller than those in pp collisions at comparable N ch .

B. Comparison between different cumulant methods
In p+Pb collisions, all three methods give consistent results for N ch > 100.Furthermore, the three-subevent method values calculated for charged particles with 0.5 < p T < 5 GeV using the standard cumulants (left) and the three-subevent method (right) compared between 5.02 TeV pp, 13 TeV pp, and 5.02 TeV p+Pb.The event averaging is performed for N sel ch calculated for the same p T range, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.
gives negative c 2 {4} values in most of the measured N ch range.
The comparison of the c 2 {4} values between the three data sets, for the standard and the three-subevent methods, is shown in Figs.7 and 8.The large positive c 2 {4} values observed in the small N ch region in the standard method are likely due to nonflow correlations, since this trend is absent when using the three-subevent cumulant method.In p+Pb collisions, the absolute value of c 2 {4} seems to become smaller for N ch > 200.
The same analysis is performed for the third-order harmonics.Figures 9 and 10 compare the c 3 {4} values between the three data sets for the standard cumulant method and the threesubevent method.The c 3 {4} values from the three-subevent method are close to zero in all three systems.For the standard method, the positive c 3 {4} values in the small N ch region indicate the influence of nonflow correlations, but the influence is not as strong as that for c 2 {4}.
Figure 11 shows the c 3 {4} values from p+Pb collisions in the two p T ranges, obtained with the three-subevent method; The harmonic flow coefficients v 2 {4} can be obtained from the measured values of c 2 {4} according to Eq. (7). Figure 12 shows the v 2 {4} values for charged particles with 0.3 < p T < 3 GeV calculated using the three-subevent method in the three data sets.Results for the higher p T range (0.5 < p T < 5 GeV) are presented in Fig. 13 12.The v 2 {4} values calculated for charged particles with 0.3 < p T < 3 GeV using the three-subevent method in 5.02 TeV pp (left), 13 TeV pp (middle), and 5.02 TeV p+Pb collisions (right).They are compared to v 2 obtained from the 2PC analyses [10,15] where the nonflow effects are removed by a template fit procedure (solid circles) or with a fit after subtraction with a ZYAM assumption (peripheral subtraction, open circles).The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.
The values of v 2 {4} presented in Figs. 12 and 13 are also compared to the values of v 2 {2} obtained from the 2PC measurements [10,15] where the nonflow effects are estimated using low-multiplicity events ( N ch < 20) and then subtracted.The subtraction was performed either by a template fit, which includes the pedestal level from the N ch < 20 events, or by a peripheral subtraction, which sets the pedestal level by a zero-yield at minimum (ZYAM) procedure [6].The peripheral subtraction explicitly assumes that the most peripheral events do not contain any long-range correlations [15], and so v 2 is forced to be zero at the corresponding N ch value, which biases v 2 to a lower value in other multiplicity ranges.
D. Dependence on the number of sources in the initial state Figures 12 and 13 show that the v 2 {4} values are smaller than the v 2 {2} values extracted using the template-fit method in both the pp and p+Pb collisions.In various hydrodynamic models for small collision systems [38,39], this difference can be interpreted as the influence of event-by-event flow fluctuations associated with the initial state, which is closely related to the effective number of sources N s for particle production in the transverse density distribution of the initial state [39]: Figure 14 shows the extracted values of N s as a function of N ch in 13 TeV pp and 5.02 p+Pb collisions, estimated using charged particles with 0.3 < p T < 3 GeV and 0.5 < p T < 5 GeV.It is observed that the N s value increases with N ch in p+Pb collisions, reaching N s ∼ 20 in the highest multiplicity class, and it is consistent between the two p T ranges.
In the model framework in Refs.[38,39], the values of |c 2 {4}| and v 2 {4} are expected to decrease for large N s , which is compatible with the presented results.The slight decreases of |c 2 {4}| shown in Figs.7 and 8 for p+Pb collisions are compatible with the model predictions.The results for 13 TeV pp collisions cover a limited N ch range compared to p+Pb, but agree with p+Pb collisions in this range.FIG. 13.The v 2 {4} values calculated for charged particles with 0.5 < p T < 5 GeV using the three-subevent method in 5.02 TeV pp (left), 13 TeV pp (middle), and 5.02 TeV p+Pb collisions (right).They are compared to v 2 obtained from the 2PC analyses [10,15] where the nonflow effects are removed by a template fit procedure (solid circles) or with a fit after subtraction with a ZYAM assumption (peripheral subtraction, open circles).The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.[38,39] and Eq. ( 14) in 13 TeV pp and 5.02 TeV p+Pb collisions, for charged particles with 0.3 < p T < 3 GeV (left) and 0.5 < p T < 5 GeV (right).The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

IX. SUMMARY
Measurements of the four-particle cumulants c n {4} and harmonic flow coefficients v n {4} for n = 2 and 3 are presented using 0.17 pb −1 of pp data at √ s = 5.02 TeV, 0.9 pb −1 of pp data at √ s = 13 TeV, and 28 nb −1 p+Pb of data at √ s NN = 5.02 TeV.These measurements were performed with the ATLAS detector at the LHC.The c 2 {4} values are calculated using the standard cumulant method and the recently proposed two-subevent and three-subevent methods.They are all presented as a function of the average number of charged particles with p T > 0.4 GeV, N ch .It is found that the c 2 {4} value from the standard method is sensitive to the choice of particles used to form the event classes used for averaging.This suggests that the previous c 2 {4} measurement in pp collisions [16,19], based on the standard method, may be dominated by nonflow correlations instead of a long-range collective flow correlation.In general, it is easy to obtain incorrect results from the standard cumulant method, depending on the nature of the nonflow fluctuations associated with the event class chosen for the analysis.
On the other hand, the sensitivity of c 2 {4} on event class definition is greatly reduced in the two-subevent method and is almost fully removed in the three-subevent method, demonstrating that the three-subevent method is more robust against nonflow effects.Similarly, the values of c 3 {4} are found to differ in the three data sets using the standard method, but are consistent with each other and much closer to zero using the three-subevent method.This gives confidence that nonflow correlations make a much smaller contribution to the three-subevent results, and that this method is more appropriate for studying long-range collective behavior than the standard cumulant method.
The three-subevent method provides a measurement of c 2 {4} that is negative in all three data sets over a broad range of N ch .The magnitude of c 2 {4} increases with p T and is nearly independent of N ch but in p+Pb collisions the values become smaller at high multiplicities.These results provide direct ev-idence for the presence of long-range multiparticle azimuthal correlations in broad N ch ranges in pp and p+Pb collisions, and these long-range multiparticle correlations persist even in events with rather low multiplicity of N ch ∼ 40.The c 3 {4} values are consistent with zero in pp collisions, but are systematically below zero in p+Pb collisions, compatible with the presence of significant long-range multiparticle triangular flow in p+Pb collisions.
The single-particle harmonic coefficient v 2 {4} = (−c 2 {4}) 1/4 is calculated and compared with v 2 {2} obtained previously using the two-particle correlation method, where the nonflow contributions were estimated and subtracted.The magnitude of v 2 {4} is smaller than that for v 2 {2}, as expected for a long-range final-state hydrodynamic collective effect.The ratio of v 2 {4} to v 2 {2} is used, in a model-dependent framework, to infer the number of particle-emitting sources in the initial-state geometric configuration.The number of sources extracted within this framework is found to increase with N ch in p+Pb collisions.
The subevent cumulant technique and the new results provide direct evidence that the ridge is indeed a long-range collective phenomenon involving many particles distributed across a broad rapidity interval.The results of v 2 {4} and its dependence on p T and N ch , largely free from nonflow effects, can be used to understand the space-time dynamics and the properties of the medium created in small collision systems.

FIG. 1 .FIG. 2 .
FIG.1.The c 2 {4} values calculated for charged particles with 0.3 < p T < 3 GeV (left) and 0.5 < p T < 5 GeV (right) with the standard cumulant method from the 13 TeV pp data.The event averaging is performed for N sel ch calculated for various p T selections as indicated in the figure, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

FIG. 3 .FIG. 4 .
FIG.3.The c 2 {4} values calculated for charged particles with 0.3 < p T < 3 GeV (left) and 0.5 < p T < 5 GeV (right) with the three-subevent cumulant method from the 13 TeV pp data.The event averaging is performed for N sel ch calculated for various p T selections as indicated in the figure, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

FIG. 5 .FIG. 6 .
FIG.5.The c 2 {4} values calculated for charged particles with 0.3 < p T < 3 GeV (left) and 0.5 < p T < 5 GeV (right) compared for the three cumulant methods from the 5.02 TeV pp data.The event averaging is performed for N sel ch calculated for the same p T range, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The dashed line indicates the c 2 {4} value corresponding to a 4% v 2 signal.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

FIG. 7 .FIG. 8 .
FIG.7.The c 2 {4} values calculated for charged particles with 0.3 < p T < 3 GeV using the standard cumulants (left) and the three-subevent method (right) compared between 5.02 TeV pp, 13 TeV pp, and 5.02 TeV p+Pb.The event averaging is performed for N sel ch calculated for the same p T range, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

FIG. 9 .FIG. 10 .
FIG. 9.The c 3 {4} values calculated for charged particles with 0.3 < p T < 3 GeV using the standard cumulants (left) and the three-subevent method (right) compared between 5.02 TeV pp, 13 TeV pp, and 5.02 TeV p+Pb.The event averaging is performed for N sel ch calculated for the same p T range, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.

FIG. 11 .
FIG. 11.The c 3 {4} values calculated for charged particles with 0.3 < p T < 3 GeV (left) or 0.5 < p T < 5 GeV (right) with the three-subevent cumulant method for the p+Pb data.The event averaging is performed for N sel ch calculated for various p T selections as indicated in the figure, which is then mapped to N ch , the average number of charged particles with p T > 0.4 GeV.The dashed line indicates the c 3 {4} value corresponding to a 2% v 3 signal.The error bars and shaded boxes represent the statistical and systematic uncertainties, respectively.