Centrality dependence of particle production in p-Pb collisions at $\sqrt{s_{\rm NN} }$= 5.02 TeV

We report measurements of the primary charged particle pseudorapidity density and transverse momentum distributions in p-Pb collisions at $\sqrt{s_{\rm NN}}$ = 5.02 TeV, and investigate their correlation with experimental observables sensitive to the centrality of the collision. Centrality classes are defined using different event activity estimators, i.e. charged particle multiplicities measured in three disjunct pseudorapidity regions as well as the energy measured at beam rapidity (zero-degree). The procedures to determine the centrality, quantified by the number of participants ($N_{\rm part}$), or the number of nucleon-nucleon binary collisions ($N_{\rm coll}$), are described. We show that, in contrast to Pb-Pb collisions, in p-Pb collisions large multiplicity fluctuations together with the small range of participants available, generate a dynamical bias in centrality classes based on particle multiplicity. We propose to use the zero-degree energy, which we expect not to introduce a dynamical bias, as an alternative event-centrality estimator. Based on zero-degree energy centrality classes, the $N_{\rm part}$ dependence of particle production is studied. Under the assumption that the multiplicity measured in the Pb-going rapidity region scales with the number of Pb-participants, an approximate independence of the multiplicity per participating nucleon measured at mid-rapitity of the number of participating nucleons is observed. Furthermore, at high-$p_{\rm T}$ the p-Pb spectra are found to be consistent with the pp spectra scaled by $N_{\rm coll}$ for all centrality classes. Our results represent valuable input for the study of the event activity dependence of hard probes in p-Pb collision and, hence, help to establish baselines for the interpretation of the Pb-Pb data.


Introduction
Proton-lead collisions are an essential component of the heavy ion programme at the Large Hadron Collider (LHC) [1]. Measurements of benchmark processes in p-Pb collisions serve as an important baseline for the understanding and the interpretation of the nucleus-nucleus data. These measurements allow one to disentangle hot nuclear matter effects which are characteristic of the formation of the quark-gluon plasma (QGP) from cold nuclear matter effects. The latter are the effects due to the presence of the nuclei themselves and not the QGP, for example k T broadening, nuclear modification of parton densities, and partonic energy loss in cold nuclear matter.
Of particular interest are studies of nuclear effects on parton scatterings at large momentum transfer (hard processes). To this end, the nuclear modification factor defined as the ratio of particle or jet transverse momentum (p T ) spectra in minimum bias p-Pb to those in pp collisions scaled by the average number of binary p-nucleon (p-N) collisions N coll is measured [2]. The latter is given by the ratio of p-N and p-Pb inelastic cross-sections times the mass number A. In the absence of nuclear effects, the nuclear modification factor is expected to be unity. In heavy ion collisions, binary scaling is found to hold in measurements of prompt photons [3] and electroweak probes [4,5], which do not strongly interact with the medium. The observation of binary scaling in p-Pb demonstrates that the strong suppression of hadrons [6], jets [7] and heavy flavour hadrons [8,9] seen in Pb-Pb collisions is due to strong final state effects. Centrality dependent measurements of the nuclear modification factor R pPb (p T , cent), defined as R pPb (p T , cent) = dN pPb cent /dp T N cent coll dN pp /dp T , require the determination of the average N cent coll for each centrality class. Moreover, it has been recognised that the study of p-Pb collisions is also interesting in its own right. Several measurements [10][11][12][13] of particle production in the low and intermediate transverse momentum region clearly show that p-Pb collisions cannot be explained by an incoherent superposition of pp collisions. Instead the data are compatible with the presence of coherent [14] and collective [15] effects. Their strength increases with multiplicity indicating a strong collision geometry dependence. In order to corroborate this hypothesis a more detailed characterisation of the collision geometry is needed. The Glauber model [16] is generally used to calculate geometrical quantities of nuclear collisions (A-A or p-A). In this model, the impact parameter b controls the average number of participating nucleons (hereafter referred as "participants" or also "wounded nucleons" [17]), N part and the corresponding number of collisions N coll . It is expected that variations of the amount of matter overlapping in the collision region will change the number of produced particles, and parameters such as N part and N coll have traditionally been used to describe those changes quantitatively, and to relate them to pp collisions.
Using the Glauber model one can calculate the probability distributions π ν (ν), where ν stands for N part or N coll . Since ν cannot be measured directly it has to be related via a model to an observable M, generally called centrality estimator, via the conditional probability P(M|ν) to observe M for a given ν. For each collision system and centre of mass energy, the model has to be experimentally validated by comparing the measured probability distribution P meas (M) to the one calculated from the convolution P calc (M) = ∑ ν P(M|ν)π ν (ν). Once the model has been validated, for each event class defined by an M-interval the average ν is calculated. In order to unambigously determine ν, one chooses observables whose mean values depend monotonically on ν. Note that in p(d)-A collisions the impact parameter is only loosely correlated to ν. Hence, although one uses traditionally the term centrality to refer to these measurements, the relevant parameters are N part and N coll .
The procedure described above can be easily extended to several estimators. Of particular interest are estimators from kinematic regions that are causally disconnected after the collision. The measurement of a finite correlation between them unambiguously establishes their connection to the common collision geometry. Typically these studies are performed with observables from well separated pseudorapidity (η) intervals, e.g. at zero-degree (spectators, slow-nucleons, deuteron break-up probability) and multiplicity in the rapidity plateau.
The use of centrality estimators in p-A collisions based on multiplicity or summed energy in certain pseudo-rapidity intervals is motivated by the observation that they show a linear dependence with N part or N coll . This is also in agreement with models for the centrality dependence of particle production (e.g. the Wounded Nucleon Model [17], or also string models like FRITIOF [18]). The total rapidity integrated multiplicity of charged particles measured in hadron-nucleus collisions (N h−A ch ) at centre of mass energies ranging from 10 to 200 GeV (E178 [19], PHOBOS [20]) is consistent with a linear dependence on N part : N h−A ch = N pp ch · N part /2. The ratio of particle pseudorapidity (η) densities in d-Au and pp collisions exhibits a dependence on η, which implies that the scaling behaviour has a strong rapidity dependence with an approximate N part -scaling at η = 0 and an approximate scaling with the number of target participants (N target part = N part − 1) in the Au-going direction [20]. In d-Au collisions at RHIC ( √ s NN = 200 GeV), the PHENIX and STAR collaborations [21,22] have used the multiplicity measured in an η-interval of width 0.9 centered at η ≈ −3.5 (Au-going direction) as a centrality estimator. The multiplicity distribution has been successfully described by the Glauber model assuming N target part -scaling. Finally, in centrality averaged p-Pb collisions at the LHC ( √ s NN = 5.02 TeV) the primary charged particle pseudorapidity density at η = 0 scaled by the mean number of participants is found to be consistent with the corresponding value in pp collisions interpolated to the same √ s NN [23].
At RHIC, the deuteron dissociation probability can be accurately modelled by a Glauber calculation and measured using the zero degree calorimeters in the d-direction [21,24]. The mean number of participants has been determined for centrality classes obtained with the multiplicity estimator described above and used to calculate the deuteron break-up probability. Inferred and measured probabilities are consistent, demonstrating the correlation between collision geometry and multiplicity, and providing a stringent test for the N part determination.
Since for example hard scatterings can significantly contribute to the overall particle multiplicity, correlations between high-p T particle production and bulk multiplicity can also be induced after the collisions and, hence, they are not only related to the collision geometry. Therefore, the use of N coll from the Glauber model to scale cross-sections of hard processes from pp to p-A has to undergo the same scrutiny as the correlation of the centrality estimator to the collision geometry. This is necessary also due to the enhanced role of multiplicity fluctuations in p-A. While the average of centrality estimators vary monotonically with ν, for a full description of the conditional probability P(M|ν) fluctuations of M for a fixed ν have to be taken into account. In Pb-Pb collisions, these multiplicity fluctuations have little influence on the centrality determination. The range of ν is large and P(M|ν) converges with increasing ν rapidly to a Gaussian with small width relative to the the range of ν. However, in p-Pb collisions, the range of multiplicities used to select a centrality class is of similar magnitude as the fluctuations, with the consequence that a centrality selection based on multiplicity may select a biased sample of nucleon-nucleon collisions (for a discussion of this effect in d+Au see [21]).
In essence, by selecting high (low) multiplicity one chooses not only large (small) average N part , but also positive (negative) multiplicity fluctuations leading to deviations from the binary scaling of hard processes. These fluctuations are partly related to qualitatively different types of collisions. High multiplicity nucleon-nucleon collisions show a significantly higher particle mean transverse momentum. They can be understood as "harder" collisions, i.e. with higher 4momentum transfer squared Q 2 or as nucleon-nucleon collisions where multiple parton-parton interactions (MPI) take place.
In contrast, a centrality selection that is not expected to induce a bias on the binary scaling of hard processes is provided by the energy measurement with the Zero Degree Calorimeters (ZDC) due to their large η-separation from the central barrel detectors. They detect the so-called "slow" nucleons produced in the interaction by nuclear de-excitation processes, or knocked out by wounded nucleons [25,26]. The relationship of the energy deposited in the ZDC to the number of collisions requires a detailed model to describe the slow nucleon production. A heuristic approach, based on a parameterization of data from low energy experiments, is discussed in the present paper.
We will show that centrality estimators using forward neutron energy and those using central multiplicity give consistent results for N part and N coll , demonstrating their connection to the collision geometry. Based on the considerations outlined above we study two different procedures for centrality estimation. The first procedure is to determine the centrality with charged particle multiplicity. The collision geometry is determined by fitting the measured multiplicity distribution with the N coll distribution obtained from the Glauber model [16], convolved with a negative binomial distribution (NBD). Due to the possible dynamical bias introduced by the multiplicity selection N coll should in this case not be used to scale hard cross-sections. Additional effort is needed to understand the bias or to extend the Glauber model to include additional dynamical fluctuations. Several possible directions have been discussed, for example Glauber-Gribov fluctuations of the proton size [27] as well as fluctuations of the number of hard-scatterings per collisions due to the impact parameter dependence and purely statistical (Poissonian) fluctuations [28].
The second procedure requires a centrality selection with minimal bias and, therefore, uses the ZDC signal. To relate the ZDC signal to the collision geometry we have developed a heuristic model for slow nucleon emission, based on a parameterization of data from low energy experiments. This heuristic approach however can provide only a model-dependent N coll determination. However, one can study the correlation of two or more observables out of which at least one is expected to scale linearly with N coll . Examples are i) the target-going multiplicity proportional to the number of wounded target nucleons (N target part = N part − 1 = N coll ), ii) the multiplicity at mid-rapidity proportional to the number of participants (N part = N coll + 1), iii) the yield of hard probes, like high-p T particles at mid-rapidity proportional to N coll . These scalings can be used as an ansatz when calculating N coll based on an event selection using the ZDC.
Both alternatives are discussed in the present paper. The paper is organised as follows. Section 2 describes the experimental conditions, the event selection, and the event characterisation using the multiplicity distributions of charged particles measured in various η ranges, or the energy collected in the ZDC. Section 3 describes the centrality determination based on charged particle distributions using an NBD-Glauber fit to extract the average geometrical quantities for typical centrality classes. Section 4 presents a phenomenological model describing the relation of the energy deposited in the ZDC calorimeter and N coll . Section 5 discusses the various effects leading to a bias in the centrality measurements based on particle multiplicity. Section 6 introduces a hybrid method, where we use the ZDC to characterize the event activity, and base the determination of N coll on the assumption that N part -scaling holds for the central pseudorapidity multiplicity density or N target part -scaling for particle production in the target region. Section 7 discusses the implications of the different choices of a centrality estimator on the physics results, such as the nuclear modification factors, or the pseudorapidity density of charged particles at mid-rapidity. Section 8 summarizes and concludes the paper.

Experimental Conditions
The data were recorded during a dedicated LHC run of 4 weeks in January and February 2013. Data have been taken with two beam configurations, by inverting the direction of the two particle species, referred to as p-Pb and Pb-p, respectively, for the situations where the proton beam is moving towards positive rapidities, or vice versa. The two-in-one-magnet design of the LHC imposes the same magnetic rigidity of the beams in the two rings ,implying that the ratio of beam energies is fixed to be exactly equal to the ratio of the charge/mass ratios of each beam. Protons at 4 TeV energy collided onto fully stripped 208 82 Pb ions at 1.58 TeV per nucleon energy resulting in collisions at √ s NN = 5.02 TeV in the nucleon-nucleon centre-of-mass system (cms), which moves with a rapidity of ∆y NN = 0.465 in the direction of the proton beam. In the following, we will use the convention that y stands for y cms , defined such that the proton moves towards positive η cms , while η stands for η lab .
The number of colliding bunches varied from 8 to 288. The proton and Pb bunch intensities were ranging from 0.2 ×10 12 to 6.5 ×10 12 and from 0.1 ×10 12 to 4.4 ×10 12 , respectively. The luminosity at the ALICE interaction point was up to 5 ×10 27 cm −2 s −1 resulting in a 10 kHz hadronic interaction rate. The RMS width of the interaction region is 6.3 cm along the beam direction and of about 60 µm in the direction transverse to the beam.
The ALICE apparatus and its performance in the LHC Run 1 are described in [29] and [30], respectively. The main detector components used for the centrality determination are: the Silicon Pixel Detector (SPD), two cylindrical layers of hybrid silicon pixel assemblies covering |η| < 2.0 for the inner layer and |η| < 1.4 for the outer layer for vertices at the nominal interaction point, with 93.5% active channels; the Time Projection Chamber (TPC), a large cylindrical drift detector covering |η| < 0.9; the VZERO scintillator counters, covering the full azimuth within 2.8 < η < 5.1 (VZERO-A) and −3.7 < η < −1.7 (VZERO-C); and the Zero Degree Calorimeters (ZDC), two sets of neutron (ZNA and ZNC) and proton (ZPA and ZPC) calorimeters positioned at ±112.5 m from the interaction point, with an energy resolution of about 20% for the neutron and 24% for the proton calorimeters.
The p-Pb trigger, configured to have high efficiency for hadronic events, requires a signal in both the VZERO-A and VZERO-C (VZERO-AND requirement). Beam-gas and other machineinduced background collisions with deposited energy above the thresholds in the VZERO or ZDC detectors are suppressed by requiring the signal arrival time to be compatible with a nomi-nal p-Pb interaction. The fraction of remaining beam-related background after all requirements is estimated from control triggers on non-colliding or empty bunches, and found to be negligible.
The resulting event sample corresponds to a so-called visible cross-section of 2.1 ± 0.06 barn measured in a van der Meer scan [31]. From Monte Carlo simulations we expect that the sample consists mainly of non-single diffractive (NSD) collisions and a negligible contribution of single-diffractive (SD) and electromagnetic interactions. The VZERO-AND trigger is not fully efficient for NSD events. Previous Monte Carlo studies (for details see [23]) have shown that the inefficiency is observed mostly for events without reconstructed vertex, i.e. with no particles produced at central rapidities. Given the fraction of such events in data (1.5%), the corresponding inefficiency was found to be 2.2% with a large systematic uncertainty of 3.1%. Correcting for this inefficiency would mainly concern the most peripheral class (80-100%) where the correction amounts up to 11% ± 15.5%. For the results reported in this paper, centrality classes have been defined as percentiles of the visible cross-section and the measurements are not corrected for trigger inefficiency.
The centrality determination is performed by exploiting the rapidity coverage of the various detectors. The raw multiplicity distributions measured in the Central Barrel are modelled by assuming particle production sources are distributed according to a NBD. The zero-degree energy of the slow nucleons emitted in the nucleon fragmentation requires more detailed models.
In this context, the main estimators used for centrality in the following are: − CL1: the number of clusters in the outer layer of the silicon pixel detector, |η| < 1.4; − V0A: the amplitude measured by the VZERO hodoscopes on the A-side (the Pb-going side in the p-Pb event sample), 2.8 < η < 5.1; − V0C: the amplitude measured by the VZERO hodoscopes on the C-side (the p-going side in the p-Pb event sample), −3.7 < η < −1.7; − V0M: the sum of the amplitudes in the VZERO hodoscopes on the A-and C-side (V0A+V0C); − ZNA: the energy deposited in the neutron calorimeter on the A-side (the Pb-going side in the p-Pb event sample).

NBD-Glauber Fit
To determine the relationship between charged particle multiplicity and the collision properties, such as the number of participating nucleons N part , binary pN collisions N coll , or nuclear overlap T pPb (= N coll /σ inel NN ), it is customary to use the Glauber Monte Carlo model combined with a simple model for particle production [32][33][34][35][36]. The method was used in Pb-Pb collisions and is described in detail in [37]. In the Glauber calculation, the nuclear density for 208 82 Pb is modelled by a Woods-Saxon distribution for a spherical nucleus ρ(r) = ρ 0 1 1 + exp r−R a (2) V0A (Pb-side) amplitude (a.u.) with ρ 0 the nucleon density, which provides the overall normalization, a radius of (R = 6.62 ± 0.06) fm and a skin depth of a = (0.546 ± 0.010) fm, based on data from low energy electronnucleus scattering experiments [38]. Nuclear collisions are modelled by randomly displacing the projectile proton and the target Pb nucleus in the transverse plane. A hard-sphere exclusion distance of 0.4 fm between nucleons is employed. The proton is assumed to collide with the nucleons of the Pb-nucleus if the transverse distance between them is less than the distance corresponding to the inelastic nucleon-nucleon cross-section of 70 ± 5 mb at √ s = 5.02 TeV, estimated from interpolating data at different centre of mass energies [39] including measurements at 2.76 and 7 TeV [40]. The VZERO-AND cross-section measured in a van der Meer scan [31] was found to be compatible, assuming negligible efficiency and EM contamination corrections, with the Glauber-derived p-nucleus inelastic cross-section (2.1 ± 0.1 b). The Glauber-MC determines on an event-by-event basis the properties of the collision geometry, such as N part , N coll and T pPb , which must be mapped to an experimental observable.
Assuming that the average V0A multiplicity is proportional to the number of participants in an individual p-A collision, the probability distribution P(n) of the contributions n to the amplitude from each p-nucleon collisions can be described by the NBD, defined as: where Γ is the gamma function, µ the mean amplitude per participant and the dispersion parameter k is related to the relative width given by σ /µ = 1/µ + 1/k. From the closure of the NBD under convolution, it follows that the conditional probability P(n|N part ), i.e. N part repeated convolutions, is equal to P(n; N part µ, N part k).
To obtain the NBD parameters µ and k, the calculated V0A distribution, obtained by convolving the Glauber N part distribution with P(n|N part ), is fitted to the measured V0A distribution. The fit is performed excluding the low V0A amplitude region, V0A< 10. We note however that  Table 1: Fit parameters of the N part × NBD for pp collisions at 7 TeV and p-Pb multiplicity distributions.   fitting with the full range gives consistent results. The measured V0A distribution together with the NBD-Glauber distribution for the best fit are shown in Fig. 1. Similar fits have been performed to V0M and CL1 and the corresponding fit parameters are listed in Table 1. The values of the parameters µ and k are similar to those obtained by fitting the corresponding multiplicity distributions in pp collisions at 7 TeV. Since the raw distribution is sensitive to experimental parameters such as noise and gain, one cannot expect identical values even in the case of perfect N part -scaling.
For a given centrality class, defined by selections in the measured distribution, the information from the Glauber MC in the corresponding generated distribution is used to calculate the mean number of participants N part , the mean number of collisions N coll , and the average nuclear overlap function T pPb . These are given in Table 2, with the corresponding σ values. Since the event selection dominantly selects NSD events, it is important to note that the number of participants in the Glauber calculation would increase by only 2.5% for NSD events. This was estimated with a modified Glauber calculation to exclude SD collisions [23].
The systematic uncertainties are evaluated by varying the Glauber parameters (radius, skin depth and hard-sphere exclusion distance) within their known uncertainty. The uncertainties on N coll are listed in Table 3, by adding all the deviations from the central result in quadrature.
The uncertainties range from about 4-5% in peripheral collisions to about 10% in central collisions. Note that, as T pPb = N coll /σ inel NN , the uncertainties on σ inel NN and N coll largely cancel in the calculation of T pPb . However, edge effects in the nuclear overlap are large for T pPb in peripheral collisions.
The procedure was tested with a MC-closure test using HIJING p-Pb simulations [28] with nuclear modifications of the parton density (shadowing) and elastic scattering switched off. In the MC-closure test the V0A distribution obtained from a detailed detector simulation coupled to HIJING was taken as the input for the fit with the NBD-Glauber method. The difference between the N coll values calculated from the fit, and those from the MC truth used in the HIJING simulation range from 3% in central to 23% in peripheral events (see Table 3). The large uncertainty in the peripheral events arises from the small absolute values of N coll itself. In this case a small absolute uncertainty results in a large relative deviation. The total uncertainty on N coll for each centrality class with the CL1, V0M or V0A estimators is obtained by adding the uncertainty from the variation of the Glauber parameters with those from the respective MC-closure test in quadrature.
The NBD-Glauber fit is repeated for the multiplicity distribution of the SPD clusters (CL1) and for the sum of V0A and V0C, V0M, in the same centrality classes as for V0A. The N coll values as a function of centrality are given in Table 3  estimators. In addition, the events from the MC-Glauber calculation were ordered according to their impact parameter, and the values of N coll were extracted for the same centrality classes. The variation of N coll between different centrality estimators is small and of similar magnitude as the systematic uncertainty obtained by adding in quadrature the uncertainty from the Glauber model and from the MC-closure test. This implies that the N coll determination with the NBD-Glauber fit is robust and independent of the centrality estimator used. An equivalent agreement between estimators was found for the other geometric properties.

Glauber-Gribov corrections
Event-by-event fluctuations in the configuration of the incoming proton can change its scattering cross-section [27]. In the Glauber MC this phenomenon is implemented by an effective scattering cross-section [41][42][43]. At high energies, the configuration of the proton is taken to be frozen over the time scale of the p-A collision. Analogously to the studies in [44,45], the effect of these frozen fluctuations of the projectile proton is evaluated with a modified version of the Glauber MC, referred to as "Glauber-Gribov". This version includes event-by-event variations of the nucleon-nucleon cross-section. Here we have used the same values of the parameter Ω, which controls the width of the probability distribution of σ inel NN , as used in [44], namely Ω = 0.55 and 1.01, where Ω = 0.0 corresponds to the standard Glauber.
The distribution of the number of participants, N part , obtained from the two Glauber-Gribov parameter variations are shown in the left panel of Fig. 3 together with a standard N part distribution obtained using a fixed inelastic cross-section, σ inel NN = 70 mb. The Glauber-Gribov N part distributions are much broader than the Glauber distribution due to the cross-section fluctuations. We note that by construction the total inelastic p-Pb cross-section is unaltered by the proton fluctuations.
The Glauber-Gribov distributions of N part and N coll , coupled to a NBD, were fitted to the measured distribution of V0A. The right panel of Fig. 3 shows the V0A distribution together with various fits performed with the standard Glauber or the Glauber-Gribov distribution, using Ω = 0.55, and assuming that the signal increases proportionally either to N part or to N coll . As Distribution µ k Std-Glauber and N coll ⊗ NBD 12.2 0.58 Std-Glauber and N part ⊗ NBD 11.0 0.44 Glauber-Gribov and N coll ⊗ NBD 12.6 1. 35 Glauber-Gribov and N part ⊗ NBD 11.0 0.60 Table 4: Fit parameters of the V0A distributions using standard Glauber and Glauber-Gribov (Ω = 0.55) distributions of N part and N coll coupled to a NBD.  before, no attempt is made to describe the most peripheral region (below ∼90%), where trigger efficiency is not 100%. The extracted parameters are given in Table 4.
The standard NBD-Glauber fits yield satisfactory results using either the N part or the N collscaling, which result in similar average number of collisions N coll evaluated for each of the centrality intervals as shown in Table 5. The Glauber-Gribov fits with Ω = 0.55 provide an equally good description of the measured V0A distribution as the standard Glauber, indicating that the fits cannot discriminate between the models. The broader N part distributions in the Glauber-Gribov models require smaller intrinsic fluctuations in the NBD at fixed N part . No satisfactory fit is obtained with Ω = 1.01. As expected, the corresponding values of N coll also shown in Table 5, are larger (smaller) for central (peripheral) than those obtained from the standard Glauber, as a consequence of the different shapes of the N part distributions in these models (see Fig. 3 (left)). Both assumptions that the multiplicity distribution is proportional to N part or N coll are found to give an equally good description of the experimental data (see Fig. 3, and parameters reported in Table 4) and the difference in the extracted geometric quantities is small (see Table 5).

Centrality from the Zero-Degree Energy
The energy measured in the zero degree calorimeters (ZDC) can be used to determine the centrality of the collision. The ZDC detects the so called "slow" nucleons produced in the interaction: protons in the proton ZDC (ZP) and neutrons in the neutron ZDC (ZN). The multiplicity of slow nucleons is expected to be monotonically related to N coll [25] and can therefore be used as a centrality estimator.
Emitted nucleons are classified as "black" or "grey". This terminology originates from emulsion experiments where it was related to the track grain density. Black particles, typically defined to have velocity β 0.25 in the nucleus rest frame, are produced by nuclear evaporation processes, while grey particles, 0.25 β 0.7, are mainly nucleons knocked out from the nucleus. Experimental results at lower energies show that the features of the emitted nucleons, such as angular, momentum and multiplicity distributions, are weakly dependent on the projectile energy in a wide range from 1 GeV up to 1 TeV ( [25] and references therein). These observations suggest that the emission of slow particles is mainly dictated by nuclear geometry.
To relate the energy deposited in the ZDC to the number of binary collisions quantitatively requires a model to describe the production of slow nucleons. Since there are no models available that are able to describe the slow nucleon emission at LHC energies, we relied on the weak dependence on collision energy and followed a heuristic approach. For this purpose we developed a model for the slow nucleon emission (SNM) based on the parameterization of experimental results at lower energies.
In the left panel of Fig. 4 it is shown that the energy detected by the neutron calorimeter on the Pb-remnant side (ZNA) is correlated with the energy detected in the proton ZDC (ZPA), up to the onset of a saturation in the emission of neutrons. This saturation effect is commonly attributed to the black component ( [25] and references therein). The energy detected by ZP is lower. This is due both to the lower number of protons in the Pb nucleus and to the lower acceptance for emitted protons that are affected by LHC magnetic fields. Furthermore, contrary to ZN, ZP response and energy resolution strongly depend on the proton impact point. In the following we focus on the ZN spectrum for these reasons.
The energy released in the ZNA is anti-correlated with the signal in the neutron calorimeter placed on the p-remnant side (ZNC) (see Fig. 4, right). The p-remnant side ZN signal cannot be easily calibrated in energy units due to the lack of peaks in the spectrum. Events characterized by low N coll values, corresponding to low energy deposit in ZNA, have the largest contribution in ZNC. This implies that the participant contribution cannot be neglected for very peripheral events, where the sample is also partially contaminated by electromagnetic processes. Therefore supposing that no nucleons are emitted in the limit that there is no collision, the model is not expected to provide a complete and reliable description for very peripheral data.
In the following, we briefly summarize the main ingredients of the developed heuristic model for slow nucleon emission. The average number of emitted grey protons is calculated as a function of N coll using a second order polynomial function: This relationship was found to be in good agreement with grey proton data measured by E910 in p-Au collisions with 18 GeV/c proton beam [46]. The coefficient values taken from the E910 fit are rescaled to Pb nuclei using the ratio (Z Pb /Z Au ): c 0 = −0.24, c 1 = 0.55, c 2 = 0.0007. The linear term is the dominant contribution while the quadratic term is negligible. Neglecting in this context a possible saturation effect for black protons, we approximate the average number of black protons using the ratio between "evaporated" and "direct" proton production measured by the COSY experiment in p-Au interactions at 2.5 GeV [47]: where N LCF is the number of Light Charged Fragments, namely the number of fragments with Z< 8. Since we cannot directly measure the number of light charged fragments in ALICE, we assumed that N LCF is proportional to the number of slow protons as measured by COSY [47]: N LCF = γ · N slow p where the proportionality factor γ = 1.71 is obtained through a minimization procedure. The first term in eq.(5) describes the grey neutron production that linearly increases with N coll and hence with N LCF . The second term reproduces the saturation in the number of black nucleons, and is based on a parameterization of results from the COSY experiment where the neutron yield is related to N LCF [47]. The values of the parameters α, a, b and c are obtained through a minimization procedure and are: The relative fraction of black and grey neutrons is evaluated assuming that 90% of the emitted neutrons are black, as measured in proton induced spallation reactions in the energy range between 0.1 and 10 GeV [48]. The number of nucleons emitted from 208 82 Pb is finally calculated event by event as a function of N coll , assuming binomial distributions with probabilities p = N slow p /82 for protons and p = N slow n /126 for neutrons.
The kinematical distributions of the black and the grey components are described by independent statistical emission from a moving frame: black nucleons are emitted from a stationary source, while grey nucleons from a frame slowly moving along the beam direction with β grey = 0.05. The angular distribution for grey tracks is forward peaked in the polar angle θ , while black nucleons are assumed to be uniformly distributed, in agreement with the experimental observations [46,49].
The average number of slow neutrons is obtained using the following formula: The neutron ZDC acceptance has been calculated coupling an event generator based on the SNM to HIJING [28] and using a full GEANT 3 [50] description of the ALICE experimental apparatus. Taking into account the experimental conditions (beam crossing angle and detector configuration), we obtain that 94% of the events have a signal in the neutron calorimeter, in good agreement with the experimental acceptance (95.6%). Since the events without ZNA signal have the same CL1, V0A and V0M distributions as the those in the 80-100% centrality bin they are attributed to this bin.
The SNM, coupled to the probability distribution for N coll calculated from the Glauber MC as in Sec. 3, is fitted to the experimental distribution of the ZDC energy in The N coll , reported in Table 3 and in Fig. 2, is then calculated for centrality classes defined by dividing the energy spectrum in percentiles of the hadronic cross-section. The systematic uncertainty on the N coll values reported in Table 3, has been evaluated by varying the model parameters within reasonable ranges: i) using for the relative fraction of black over grey protons N black p = 0.43 N grey p from spallation reaction results [48], ii) including a saturation effect for black protons, iii) decreasing the ratio of black over grey neutrons to 0.5 as obtained from DPMJET [51], iv) neglecting the linear term in Eq. 5 and assuming complete saturation for the neutrons, v) varying γ by ±10% and vi) assuming different parametrization for the fluctuations in the number of slow nucleons for a fixed N coll value. When using the N coll values for the ZNA centrality estimator, the total systematic uncertainty on N coll is obtained by adding the uncertainties from the Glauber and SNM parameters in quadrature.
Within the Glauber-model, the consistency between measurements of N coll in largely separated rapidity regions establishes their relation to centrality. To this end, we correlate the ZNA measurements to the amplitudes measured in the innermost ring of the VZERO-A detector (V0A ring1), since this ring covers the most forward rapidity in the Pb-going direction. The N coll distributions (P(N coll |cent ZNA )) for centrality classes selected with ZNA (cent ZNA ) are obtained from the SNM-Glauber fit. These are convolved with the NBD obtained from the NBD-Glauber fit to the MB V0A ring1 distribution. Fig. 6 compares the distributions of V0A ring1 obtained from these convolutions to the ones measured in the same ZNA centrality classes. As expected, the distributions in the most peripheral events, where the SNM does not provide a reliable description of the data, are not well reproduced by the Glauber-MC convolution. In all other classes, the experimental distributions are well reproduced. The deviations are consistent with the ones between N ZNA coll (see Table 3) and N Pb−side coll (see Table 7) assuming that the target-going charged-particle multiplicity measured in V0A ring1 is proportional to the number of wounded target nucleons.
In addition, Fig. 6 shows the results of an unfolding procedure. For each V0A ring1 distribution selected by a ZNA centrality class, we find the N coll distribution that, convolved with the NBD MB , fits the data, i.e. the parameters of the fit are the relative contributions of each N coll bin. The unfolded distributions (shown in blue) agree well with the data for all centrality bins, apart from a small discrepancy in the 80-100% distribution at low amplitude, which is affected by trigger and event selection efficiency, however. The N MB coll distribution which results from the sum of the unfolded distributions of all centrality bins agree well with the one from Glauber-MC. The existence of N coll distributions that folded with NBD agree with measured signal distributions is a necessary condition for ZNA to behave as an unbiased centrality selection. In contrast, it is worth noting that a centrality selection based on central multiplicity, as CL1, has no such solution; i.e. no such good agreement can be found when the V0A ring1 distributions are selected by ordering the events according to CL1. The biases related to centrality selection will be discussed in the next section. The assumption that the ZNA selection is bias free will be used in Sec. 6 as an ansatz for the hybrid method.

Multiplicity Bias
Section 3.1 describes the NBD-Glauber fitting procedure used to determine the collision geometry in terms of N coll and N part , for each centrality class. The NBD is used to account for multiplicity fluctuations at fixed N part . In contrast to Pb-Pb collisions, for p-Pb collisions these multiplicity fluctuations are sizeable compared to the width of the N part distribution, as illustrated in Fig. 7. For large fluctuations, a centrality classification of the events based on multiplicity may select a sample of nucleon-nucleon collisions which is biased compared to a sample defined by cuts on the impact parameter b.
This selection bias, which occurs for any system with large relative statistical fluctuations in particle multiplicity per nucleon-nucleon collision, can be quantified using the Glauber fit itself. The left panel of Fig. 8 shows the ratio between the average multiplicity per average participant and the average multiplicity of the NBD as a function of centrality. In Pb-Pb collisions, where the width of the plateau of the N part distribution is large with respect to multiplicity fluctuations, the ratio deviates from unity only for the most peripheral collisions. As expected, in p-Pb collisions the ratio differs from unity for all centralities with large deviations for the most central and most peripheral collisions; the most central (peripheral) collisions have on average much higher (lower) multiplicity per participant. When selecting event classes using impact parameter b intervals, there is no deviation from unity, as expected. The right panel of Fig. 8 shows for each centrality estimator the relative width of the NBD distribution (σ /µ). As expected, the estimators with the largest bias on the multiplicity per participant correspond to those with the largest relative width.
It is instructive for the further discussion to consider the clan model [52], which is the standard physical explanation of the NBD distribution in the context of particle production in pp collisions. In this model particle sources, called ancestors, are produced independently according to a Poisson distribution with mean value, N = k · ln (1 + µ/k). Each ancestor can produce on average µ/ N particles, e.g. by decay and fragmentation, and a clan contains all particles that stem from the same ancestor. Hence, the bias observed above also corresponds to a biased number of clans, which are sources of particle production. Analogously, in all recent Monte Carlo generators a large part of the multiplicity fluctuations is indeed due to the fluctuations of the number of particle sources, i.e. multiple semi-hard (Q 2 ≫ Λ QCD ) parton-parton scatterings (MPI).
As an example, the HIJING generator accounts for fluctuations of the number of MPI per NN interaction via an NN overlap function T NN (b NN ), where b NN is the NN impact parameter, i.e. the impact parameter between the proton and each wounded nucleon of the Pb-nucleus. The probability for inelastic NN collisions is given as one minus the probability to have no b (fm)  interaction: where σ soft is the geometrical soft cross-section of 57 mb [28] related to the proton size and σ hard the energy dependent pQCD cross-section for 2 → 2 parton scatterings. Further, as in the clan model, there is a Poissonian probability for multiple hard collisions with an average number determined by b NN : Hence, the biases on the multiplicity discussed above correspond to a bias on the number of hard scatterings (n hard ) and b NN in the event. The latter correlates fluctuations over large rapidity ranges (long range correlations). As a consequence, for peripheral (central) collisions we expect  a lower (higher) than average number of hard scatterings per binary collision, corresponding to a nuclear modification factor less than one (greater than one).
In general, the number of binary pN collisions, N coll , is used to scale the reference pp yields and obtain the nuclear modification factor, used to quantify nuclear matter effects. However, for centrality classes based on multiplicity, owing to the bias induced by such selection, hard processes do not simply scale with N coll but rather with an effective number of collisions, obtained by scaling the N Glauber coll by the number of hard scatterings per pN collision: N Glauber coll n hard pN / n hard pp . As discussed in the HIJING example above, the number of hard scatterings per pN collision is simulated in Monte Carlo models. In this specific MC, even with- out bias, the total number of hard scatterings deviates from simple N coll -scaling due to energy conservation at high N coll . Instead, with the objective to study a baseline corresponding to an incoherent and unconstrained superposition of nucleon-nucleon collisions, the PYTHIA [53] event generator has been coupled to the p-Pb MC Glauber calculation. For each MC Glauber event PYTHIA is used to generate N coll independent pp collisions. In the following we refer to this model as G-PYTHIA. In this model, the number of hard scatterings per pN collision shows a strong deviation from N coll scaling which is illustrated in Fig. 9 and resembles the bias observed in Fig. 8.

Jet-veto bias
Additional kinematic biases exist for events containing high-p T particles. These particles arise from the fragmentation of partons produced in parton-parton scattering with large momentum transfer. Their contribution to the overall multiplicity rises with parton energy and, thus, can introduce a trivial correlation between the centrality estimator and the presence of a high-p T particle in the event. In particular, for very peripheral collisions, the multiplicity range that governs the centrality for the bulk of soft collisions can represent an effective veto on hard processes, leading to a R pPb < 1. This bias is illustrated in Fig. 10. It shows a multiplicity distribution which is used as centrality estimators in p-Pb collisions, compared to the same distribution in pp collisions at √ s = 7 TeV. The dashed lines mark the 80% and the 60% percentile of the p-Pb cross-section. The fraction of the pp cross-section selected with the 80-100% (60-100%) p-Pb multiplicity cut is 0.8 (0.97). The 80% cut in p-Pb is smaller than the multiplicity range covered in pp, therefore resulting in an effective veto on the large multiplicity events produced by hard processes.

Geometric bias
The b NN dependence of particle production postulated in section 5.1 leads to a purely geometrical, centrality estimator independent bias for peripheral p-Pb collisions [54]. As illustrated in Fig. 11, the mean impact parameter between two nucleons (b NN ), calculated from a Monte Carlo Glauber simulation, is almost constant for central collisions, but rises significantly for N part < 6. This reduces the average number of MPIs for most peripheral events, enhancing the effect of the bias leading to a nuclear modification factor less than (greater than) one for peripheral (central) collisions.
In summary, based on simplified models we have identified three different possible biases that are expected to lead to deviations from unity at high p T of the nuclear modification factors in peripheral and central collisions. As will be discussed and studied in the following sections, the effect decreases with increasing rapidity separation between the R pPb measurement and the centrality estimator.
For the estimators we used, the main biases are: 1. CL1: strong bias due to the full overlap with tracking region. Additional bias from the "jet veto effect", as jets contribute to the multiplicity and shift events to higher centralities (p T -dependent) ; 2. V0M: reduced bias since the VZERO hodoscopes are outside the tracking region; 3. V0A: reduced bias because of the enhanced contribution from the Pb fragmentation region; 4. ZNA: no bias expected.
In addition, independent of the centrality estimator, there is a geometrical bias for peripheral collisions (see Sec.5.3).

Basis and assumptions of the method
The hybrid method presented in the following section aims to provide an unbiased centrality estimator and relies on two main assumptions. The first is to assume that an event selection based on ZN does not introduce any bias on the bulk at mid-rapidity and on high-p T particle production. This selection was also used in the method proposed in Sec.4, however the N coll determination provded by the SNM-Glauber model is model-dependent. In contrast, in the hybrid method, the N coll determination is based -as an ansatz-on a particular scaling for particle multiplicity (the second assumption), e.g. we assume that the charged-particle multiplicity measured at mid-rapidity scales with the number of participants.
To obtain more insight into the particle production mechanisms, we study the correlation of various pairs of observables that, in ZN-centrality classes, are expected to scale linearly with N part or N coll . One of these observables is the charged-particle density dN ch /dη in |η| < 2.0, measured with the SPD. The charged particle pseudorapidity density is obtained from the measured distribution of tracklets, formed using the position of the primary vertex and two hits, one on each SPD layer [23]. At larger pseudorapidities, where a direct multiplicity measurement is not available, we study the raw signals of the four rings of VZERO-A and VZERO-C detectors separately. We exploit both beam configurations, p-Pb and Pb-p in order to cover the widest possible rapidity range. To take into account the impact of secondary particles, the pseudorapidity coverage of the VZERO detector rings with respect to the primary charged particles was calculated with a full detector simulation based on DPMJET [50,51] and it is given in Table 6 in the centre-of-mass system (cms), which moves with a rapidity of ∆y NN = 0.465 in the direction of the proton beam (see Sec. 2).  The information about charged particle multiplicity, dominated by soft particles, is complemented with observables from hard processes which are expected to scale with the number of binary collisions, such as the yield of high-p T (10 <p T < 20 GeV/c) particles measured at mid-rapidity (|η| < 0.3).
In order to compare these observables on the same scale and also, at first order, to neglect detector efficiency and acceptance effects, we use so called normalized signals S i / S MB . These are obtained dividing S i , i.e. the mean value of dN ch /dη, number of raw SPD tracklets or raw VZERO signal in a given ZN-centrality class i, by the corresponding mean values in minimum bias collisions. Figure 12 shows, for bins in ZN centrality, the correlation between a few selected normalized signals and the normalized charged-particle density averaged over −1 < η < 0. The statistical uncertainty is negligible, while the systematic uncertainties largely cancel in the ratio to the MB signals. One can note that the correlation exhibits a clear dependence on the pseudorapidity of the normalized signal. The slope of the normalized signals with dN ch /dη diminishes towards the proton direction (C-side in p-Pb collisions). For example, in the innermost ring of the VZERO-C detector the signal amplitude range is about a factor three, while for the innermost ring of the VZERO-A detectors it is about twice as large.
In the Wounded Nucleon Model [17], the total number of participants N part is expressed in terms of target and projectile participants. The charged particle density at mid-rapidity is thus proportional to N part , whereas at higher rapidities the model predicts a dependence on a linear combination of the number of target and projectile participants with coefficients which depend on the rapidity. Close to Pb-rapidity a linear wounded target nucleon scaling (N target part = N part -1) is expected.
In order to further understand the relative trends of the observables in Fig. 12 and to relate them with geometrical quantities, such as N part , one can adopt the Wounded Nucleon Model and make the assumption that dN ch /dη in −1 < η < 0 is proportional to N part . In this case, the other observables can be related to N part , assuming linear or power-law dependence. The   linear dependence can be parameterized with N part − α, where α is a free parameter. Then the normalized signals can be expressed with (N part − α)/ N part − α and one obtains the following linear relation: (9) where N part MB = 7.9 is the average number of participating nucleons in minimum bias collisions. The relation is used to find α for each observable by a fit to the data. Analogously, we can also fit a power-law function as: where the w i are the width of the centrality classes and β is a fit parameter. Since we made the assumption that dN ch /dη in −1 < η < 0 is proportional to N part , β obtained from Eq. 10 equivalently quantifies the deviations from a perfect N part (β = 1) scaling. As can be seen from the lower panels of Fig. 12 the power-law fit describes the data better, especially for the observables located further away from mid-rapidity. This also means that the linear dependence assumed in Eq.9 can only be valid approximately. Fig. 13 shows the results of the fits in Eq. 9 and 10 as a function of η cms of the measured observables. The figure displays data collected in both p-Pb and Pb-p beam configurations.
Since the Pb-p data were taken at high-luminosity (reaching 200 kHz, roughly corresponding to a luminosity of 10 29 s −1 cm −2 ), the results are affected by interaction pile-up (probability per bunch crossing between 3.8-4.3%). In order to reduce the effect of the pile-up and to treat p-Pb and Pb-p data consistently, we excluded the 0-5% centrality class from the fits. Furthermore, in order to take into account the remaining distortions in the 5-100% classes, the Pb-p data were corrected using the results for the tracklets (also shown in Fig. 13) in a small η-region, (|η lab | < 0.2), where |η cms | is nearly identical for p-Pb and Pb-p configurations. Typically the absolute correction is 0.05 and 0.01 for the α and β parameters, respectively. The results presented in Fig. 13 indicate a smooth and continuous change of the scaling behaviour for charged particle production with pseudorapidity. It is worth noting that at large negative pseudorapidity (Pb-going direction) the values of the parameters α and β reach those obtained for charged-particle production at high-p T . In contrast, the parameter values are much lower in the proton-going direction. Our data are overlaid with the corresponding fit parameters derived from PHOBOS charged-particle multiplicity measurements in d-Au collisions at √ s NN = 200 GeV [20]. The normalized charged-particle multiplicity in each pseudorapidity bin is fit against dN/dη i dN/dη MB |η|<0.1 using Eq. 9 and 10. The results obtained in this way are then adjusted by scaling the x-axis (η cms ) by the ratio of the beam rapidities in p-Pb at √ s NN = 5.02 TeV and d-Au collisions at √ s NN = 200 GeV. The comparison between PHOBOS and our data shows a good agreement over a wide η range, with some deviations at large negative pseudorapidity. In particular, the η region covered by the innermost ring of the VZERO-A detector corresponds to the target fragmentation region where extended longitudinal scaling was observed at RHIC [20]. The minimum bias N part and N coll are obtained by PHOBOS relying on a tuned HIJING-based Monte Carlo simulation [20].

Calculation of N coll
As discussed in the previous section, selecting the events using the ZN signal is expected to be free from bias on the bulk multiplicity or high-p T particle yields. In order to establish a relationship to the collision geometry, we exploit the findings from the correlation analysis described above and make use of observables that are expected to scale as a linear function of N coll or N part .
Three sets of N coll values are calculated, based on the following assumptions: 1. N mult coll : the charged-particle multiplicity at mid-rapidity is proportional to the number of participants (N part ); 2. N high−p T coll : the yield of charged high-p T particles at mid-rapidity is proportional to the number of binary NN collisions (N coll ); 3. N Pb−side coll : the target-going charged-particle multiplicity is proportional to the number of wounded target nucleons (N target part = N part -1 = N coll ). For the charged-particle multiplicity in the Pb-going side we use the signal from the innermost ring of the VZERO-A detector. We note that assumptions 1) and 2) are satisfied for minimum bias collisions, where we measured a value of (dN ch /dη cms )/ N part consistent with that in inelastic pp collisions (0.97 ± 0.08) [23] and an integrated R pA (10 <p T < 20 GeV/c) = 0.995 ± 0.010 (stat.) ± 0.090 (syst.) (see Sec. 7). Therefore, in order to obtain the average number of binary NN collisions in each centrality interval, the minimum bias value of N part MB = 7.9, is scaled using the ratio of the multiplicity at mid-rapidity: In a similar way the minimum bias value of N coll MB = 6.9 is scaled using the ratio of the yield of high-p T particles at mid-rapidity to obtain N high−p T coll : where S stands for the charged-particle yields with 10 < p T < 20 GeV/c. Alternatively, one can use the Pb-side multiplicity to obtain N Pb−side coll N coll where S stands for the raw signal of the innermost ring of VZERO-A. The obtained values of N coll in ZN-centrality classes are listed in Table 7 and shown in Fig. 14  The systematic uncertainty is given by the 8% uncertainty on the N coll MB (or the 3.4% uncertainty on the T pPb MB ) listed in Table 3. We assign no uncertainty to the assumptions made for particle scaling. The differences between the three sets of values do not exceed 9% in all  centrality classes. This confirms the consistency of the assumptions used, but it does not prove that any (or all) of the assumptions are valid.
We note that these values, in particular N Pb−side coll , agree within 12% with those calculated with the SNM (see Fig. 2 and Table 3), except for the most peripheral reactions, where the SNM is inaccurate. In addition, we plot in Fig. 15 the zero degree signal from neutral particles in the proton-going direction ZNC vs N mult coll . Over a wide range of centralities (10-100%) a linear anti-correlation is observed. This is consistent with a longitudinal energy transfer of the proton proportional to the number of binary collisions. 7 Results and implications for particle production

Charged Particle Density
The measurement of the centrality dependence of the particle multiplicity density allows a discrimination between models that describe the initial state of heavy ion collisions. In [23] we described the charged particle pseudorapidity density in minimum bias collisions. The same analysis was repeated, dividing the visible cross-section (see Sec.2) into event classes defined by the centrality estimators described above, and the N part values associated to each centrality interval were calculated using the methods discussed in Sec. 3, 4, 6.
The results of the charged particle multiplicity density as a function of the pseudorapidity are presented in Fig. 16 for different centrality intervals and different centrality estimators. The fully correlated systematic uncertainty, detailed in [23], is 2.2% and is shown, but is smaller than the marker size in the figure.
In peripheral collisions (60-80% and 80-100%) the shape of the distribution is almost fully symmetric and resembles what is seen in proton-proton collisions. In more central collisions, the shape of dN ch /dη becomes progressively more asymmetric, with an increasing excess of particles produced in the direction of the Pb beam compared to the proton-going direction. The shape of the pseudorapidity density function is sensitive to details of particle production models. For example, it was found in [23] that in minimum bias reactions the η lab dependence is described relatively well by HIJING [55] or DPMJET [51], with a gluon shadowing parameter tuned to describe experimental data at lower energy, whereas the saturation models [56][57][58]  exhibit a steeper η lab dependence than the data. We have quantified the centrality evolution of the pseudorapidity shape for the different centrality estimators by analyzing the density at mid-rapidity, and the asymmetry of particle yield between the proton and the Pb peak regions, as the ratio of dN ch /dη at 0 < η < 0.5 and −1.5 < η < −1.0, symmetrically around the centre of mass. This is shown in Fig. 17.
The left side of Figure 18 shows the dN ch /dη integrated at mid-rapidity divided by the number of participants as a function of N part for various centrality estimators. The systematic uncertainty is smaller than the marker size. For the V0A centrality estimator, in addition to the N part from the standard Glauber calculation, the results obtained with the implementation of Glauber-Gribov model (with Ω = 0.55) are also shown. For CL1, V0M, and V0A, the charged particle density at mid-rapidity has as steeper than linear increase, as a consequence of the strong multiplicity bias, which is strongest in CL1, where the overlap with the tracking region is maximum. This trend is not seen in the case of the Glauber-Gribov model, which shows a relatively constant behaviour for the integrated yield divided by the number of participant pairs, with the exception of the most peripheral point.
For ZNA, there is a clear sign of saturation above N part ∼ 10, as the N part values are closer to each other. Most probably, this is due to the saturation of forward neutron emission. We note that none of these curves point towards the pp data point. This suggests that the geometry bias, present in peripheral collisions, together with the multiplicity bias for CL1, V0M and V0A, has a large effect on this centrality class.
In contrast, the results obtained with the hybrid method, where the N Pb−side part and the N high−p T part give very similar trends, show, within ±10%, scaling with N part , which naturally reaches the pp point, well within the quoted uncertainty of 8% on the N part values. In addition, they show that the range in N part covered with an unbiased centrality selection is more limited than what is obtained using estimators based on particle multiplicity. The latter do not select on the collision geometry but rather on the final products of the collision. This effect is emphasized in the right plot, which shows the same quantity N ch divided by N part as a function of N ch . Here the limited

Nuclear modification factors
As discussed in section 5, the various centrality estimators induce a bias on the nuclear modification factor depending on the rapidity range they cover. In contrast to minimum bias collisions, where N coll = 6.9 is fixed by the ratio of the pN and p-Pb cross-sections, in general, N coll for a given centrality class cannot be used to scale the pp cross-section, or to calculate centralitydependent nuclear modification factors. For a centrality selected event sample, we therefore define Q pPb as Q pPb (p T ; cent) = dN pPb cent /dp T N Glauber coll dN pp /dp T = dN pPb cent /dp T T Glauber pPb dσ pp /dp T (15) for a given centrality percentile according to a particular centrality estimator. In our notation we distinguish Q pPb from R pPb because the former is influenced by potential biases from the centrality estimator which are not related to nuclear effects. Hence, Q pPb can be different from unity even in the absence of nuclear effects.
The p T distribution of primary charged particles in minimum bias collisions is given in [59]. The charged particle spectra are reconstructed with the two main ALICE tracking detectors, the Inner Tracking System and the Time Projection Chamber, and are corrected for the detector and reconstruction efficiency using a Monte Carlo simulation based on the DPMJET event generator [51]. The systematic uncertainties on corrections are estimated via a comparison to a Monte Carlo simulation using the HIJING event generator [28], while the p T resolution is estimated from the space-point residuals to the track fit and verified with data. The total systematic uncertainty ranges between 3.4% and 6.7% in the measured p T range, 0.15-50 GeV/c, with a negligible η cms dependence. The nuclear modification factor is calculated by dividing the data by the reference pp spectrum scaled by N coll MB . The reference pp spectrum is obtained by interpolating data measured at √ s = 2.76 TeV at low p T (p T < 5 GeV/c), and by scaling the measurements at √ s = 7 TeV with the ratio of spectra calculated with NLO pQCD at √ s = 5.02 and 7 TeV. The systematic uncertainty, given by the largest of the relative systematic uncertainties of the spectrum at 2.76 or 7 TeV at low-p T , and assigned from the relative difference between the NLO-scaled spectrum for different scales and the difference between the interpolated and the NLO-scaled data at high-p T , ranges from 6.8% to 8.2%. The nuclear modification factor R pPb is consistent with unity for p T above 6 GeV/c.
The same analysis was repeated dividing the visible cross-section (see Sec.2) in event classes defined by the centrality estimators described above, and the Q pPb were calculated using the values of N coll listed in Tables 3 and 7, for each given estimator. Figure 19 shows the Q pPb for different centrality estimators and different centrality classes. The uncertainties of the p-Pb and pp spectra are added in quadrature, separately, for the statistical and systematic uncertainties. The systematic uncertainty on the spectra is only shown for the V0A 0-5% centrality bin and is the same for all others, since all the corrections are independent of centrality. The total systematic uncertainty on the normalization, given by the quadratic sum of the uncertainty on the normalization of the pp data and the normalization of the p-Pb data, amounts to 6.0% and is shown as a gray box around unity. The systematic uncertainty on T pPb is shown as a light blue box around unity. For simplicity, we draw only the uncertainty for the minimum bias value T pPb MB .
As expected, for CL1, V0M and V0A, Q pPb strongly deviates from unity at high p T in all centrality classes, with values well above unity for central collisions and below unity for peripheral collisions. However the spread between centrality classes reduces with increasing rapidity gap between the range used for the centrality estimator and that used for the p T measurement. There is a clear indication of the jet-veto bias in the most peripheral CL1 class, where Q pPb has a significant negative slope since the jet contribution to the total multiplicity increases with p T . This jet-veto bias diminishes for V0M, and is absent for V0A, where Q pPb < 1 for peripheral collisions, indicating that the multiplicity bias is still present.
In order to study the centrality determination biases further, the Q pPb spectra are compared to the G-PYTHIA spectra. The event centrality is obtained from the charged particle multiplicity in the rapidity region covered by each estimator in the same way as in data, and N coll is directly obtained from the Monte Carlo. The calculation is shown as lines in Fig. 19. With this approach, the general trend at high p T is reasonably well described for all centrality classes, particularly for CL1. This suggests that the proper scaling for high p T particle production is an incoherent superposition of pp collisions. The agreement, however, is not as good for the V0A and V0M estimators, since the model is not adequate for forward particle production, particularly in the target fragmentation region. G-PYTHIA also reproduces the jet-veto bias, as indicated by the good agreement of the p T dependence in the low and intermediate p T region in the most peripheral CL1 collisions.
However, for central collisions, the Q pPb values show a significant enhancement at intermediate p T ≈ 3 GeV/c (called the Cronin effect, a nuclear modification factor above unity at intermediate p T , observed at lower energies in p-A collisions [24,[60][61][62]), which increases with centrality independently of the estimator used.  Fig. 19: Q pPb spectra (points) of all primary charged particles for various centrality classes obtained with the different centrality estimators explained in the text. The lines are from G-PYTHIA calculations. The systematic error on the spectra is only shown for the V0A 0-5% centrality bin and is the same for all others. The systematic uncertainty on pp and p-Pb normalization is shown as a gray box around unity at p T = 0. The systematic uncertainty on T pPb MB is shown as a light blue box around unity at high p T .
is about 15%, and the differences in the height of the peak among centrality estimators are small with respect to the absolute increases of the p-Pb yields. The enhancement is not reproduced by our model of incoherent superposition of pp collisions. In contrast, in the low p T region, below the Cronin peak, the yield is overestimated by the model. This overestimate at low p T is expected because this p T region is dominated by soft processes and therefore is not expected to scale with N coll . On the other hand, the intermediate p T region is expected to be dominated by hard scatterings and should scale with N coll in the absence of nuclear effects. From this we can conclude that the Cronin enhancement observed is due to nuclear modification effects, as observed in other measurements [10][11][12][13], as well as in the minimum bias R pPb [2].
The bottom right plot of Fig. 19 Table 7, and are obtained with assumptions on particle production described in Sec. 6. selected by the ZNA present spectra much more similar to each other than the other estimators, as expected in the absence of a multiplicity bias. The height of the Cronin peak relative to the yield at high-p T is larger with the V0A selection, which may be seen as a sign of a remaining small bias in V0A, expected from the G-PYTHIA calculations. However, for peripheral collisions (60-80% and 80-100%), the absolute values of the spectra at high p T indicate the presence of a bias in the ZNA measurement. This is not due to the event selection, but is due to the inaccurate estimate of N coll values for peripheral events, where a small, absolute uncertainty results in a large relative deviation in the Q pPb calculation.
As discussed in Sec. 6, the hybrid method uses centrality classes selected with ZNA and N coll values determined with assumptions on particle production. Fig. 20 shows the resulting Q pPb values, Q mult pPb on the left and Q Pb−side pPb on the right panel. Here it is important to note that the ratios in the lower right panel in Fig. 19, and both panels in Fig. 20 have the same shape by construction, and only differ due to the scaling (N coll ) of the reference. The small differences among the N coll values (Table 7) are reflected in consistent Q pPb , which also remain consistent with unity at high-p T for all centrality classes. This confirms the absence of initial state effects, already observed for minimum bias collisions. The Cronin enhancement, which has already been noted in minimum bias collisions, is observed to be stronger in central collisions and nearly absent in peripheral collisions. The enhancement is also weaker at 5.02 TeV compared to 200 GeV [60]. The geometry bias, described in Sec.5.3, is still present and uncorrected, even with this method. Its effect is limited to only peripheral classes, resulting in Q pPb < 1 for 80-100%.
The mean Q pPb at high momentum for the various centrality estimators is shown as a function of centrality in Fig. 21. The centrality dependence of Q Glauber pPb extracted from multiplicity distributions is shown on the left. It is reminiscent of the multiplicity bias, and reproduced by the G-PYTHIA calculation (lines in the figure). The mean Q pPb changes less with increasing rapidity gap between the centrality estimator and the region where the p T measurement is Average Q pPb calculated (p T > 10 GeV/c) as a function of centrality, with various centrality estimators. The left panel shows results from the data (points) and from the G-PYTHIA calculation (lines). The right panel shows the results for the hybrid method, where centrality classes are selected with ZNA, and N coll are calculated with the assumptions on particle production described in Sec. 6. performed, as expected from the multiplicity bias. Instead, the Q pPb extracted with the hybrid model (Fig. 21 right) is consistent with unity and the results from the two assumptions used for the N coll calculation are in agreement.
To compare the impact of the multiplicity bias from the different estimators on the nuclear modification factors, the ratio of the spectra in pp and p-Pb in different momentum ranges (Y pPb /Y pp ) is divided by the ratio of charged particle density at mid-rapidity in pp and p-Pb (N pPb ch /N pp ch ) and it is plotted as a function of (N pPb ch /N pp ch ) in Fig. 22. Left and middle panels show the yield at high-p T (10-20 GeV/c) and around the Cronin peak (3 GeV/c), respectively. Figure 22 clearly shows the shape bias on particle spectra. Even for the same average event activity at midrapidity (corresponding to the same point on the x-axis N pPb ch /N pp ch ), the p T spectra show a small but significant dependence on the centrality estimator. This is visible as a different relative number of particles (Y pPb /Y pp ) in the intermediate (3 GeV/c) or in the high-p T (10-20 GeV/c) region. Also the height of the Cronin peak relative to the high-p T yield depends on the centrality estimator. This is shown in the right panel of Fig. 22, which plots the double ratio of the p-Pb to pp yields at 3 GeV/c and in 10-20 GeV/c ((Y pPb /Y pp ) 3GeV/c /(Y pPb /Y pp ) 10−20GeV/c ). Since, for CL1, Q pPb is not constant at high p T we plot also the ratio (Y pPb /Y pp ) 3GeV/c to the value calculated with G-PYTHIA at 3 GeV/c. The Cronin peak is clearly visible for the V0M and CL1 (with respect to G-PYTHIA) selection, and very pronounced for the V0A selection. As previously noted, the ZNA selection shows a similar trend and similar value as V0A, when restricted to the dN ch /dη range common to both estimators. However the differences are still significant, and the common range is still rather small. In particular, the height of the Cronin peak is larger with ZNA than with V0A in the common dN ch /dη range, which may be seen as a sign of a remaining small bias in V0A, confirming what is observed by G-PYTHIA calculations.
The study of the correlation between observables measured in such different parts of phase space has shown that it is possible to select similar event classes using estimators that are causally disconnected after the interaction. This is very important because this suggests that any such correlation can only arise from the initial geometry of the collision.

Summary
In summary, we have studied the centrality dependence of charged particle production, with measurements that comprise the charged particle pseudorapidity density and the nuclear modification factor. The methods to determine centrality in p-A collisions using multiplicity measurements or zero-degree energy have been presented in detail. The former induce a bias on the hardness of the pN collisions that can be quantified by the number of hard scatterings per pN collision. Low (high) multiplicity p-Pb corresponds to lower (higher) than average number of hard scatterings. For observables based on centrality estimates from multiplicity, nuclear effects should be calculated, including this bias when comparing to an incoherent superposition of pN collisions.
In contrast, the energy deposited at zero degrees by slow nucleons in the ZDC is expected to be insensitive to a multiplicity bias. Under this assumption, but in the absence of a model which properly relates the ZDC energy to the number of collisions, these are calculated assuming multiplicity scaling laws in the given kinematic ranges. In particular, we assume that the multiplicity at mid-rapidity is proportional to N part , that multiplicity in the target-going direction is proportional to the number of wounded target nucleons, or that the yield of high-p T particles is proportional to N coll . The equivalence of these assumptions has been shown and discussed. Therefore, under these assumptions, we find i) that nuclear modification factors are consistent with unity above ∼ 8 GeV/c, with no centrality dependence; ii) that the multiplicity of charged particles at mid-rapidity scales linearly with the total number of participants; iii) and that the longitudinal features of p-Pb collisions at √ s NN = 5.02 TeV, as reflected by the centrality dependence of the pseudorapidity distributions of charged particles, are very similar to those seen in d-Au collisions at RHIC energies. The latter were interpreted in support of extended longitudinal scaling in the fragmentation regions. These results represent valuable input for the study of the event activity dependence of hard probes in p-Pb collision and, hence, help to establish baselines for the interpretation of the Pb-Pb data.