Intriguing properties of multiplicity distributions

Multiplicity distributions exhibit, after closer inspection, peculiarly enhanced void probability and oscillatory behavior of the modified combinants. We discuss the possible sources of these oscillations and their impact on our understanding of the multiparticle production mechanism. Theoretical understanding of both phenomena within the class of compound distributions is presented.


Introduction
Multiplicity distributions, P (N ), are among the first observables measured in any multiparticle production experiment and are among the most thoroughly investigated and discussed sources of information on the mechanism of the production process [1]. Nevertheless, it seems that some of their properties remain unnoticed or unused as a possible source of such information. In this work we analyse the non-single diffractive (NSD) charged multiplicity distributions concentrating on two features: (i) on the observation that, after closer inspection, they show a peculiarly enhanced void probability, P (0) > P (1) [2,3], and (ii) on the oscillatory behavior of the so called modified combinants, C j , introduced by us in [4,5,6]. We demonstrate how these modified combinants can be extracted experimentally from the measured P (N ) by means of some recurrence a e-mail: maciej.rybczynski@ujk.edu.pl b e-mail: grzegorz.wilk@ncbj.gov.pl c e-mail: zbigniew.wlodarczyk@ujk.edu.pl relation involving all P (N < j), and argue that they contain information (located mainly in the small N region) which was so far not disclosed and used. This information is hidden in the specific distinct oscillatory behavior of the C j , which, in most cases, is not observed in the C j obtained from the P (N ) commonly used to fit experimental results. We discuss the possible sources of such behavior and the connection of the C j with the enhancement of void probabilities, and their impact on our understanding of the multiparticle production mechanism with the emphasis on the theoretical understanding of both phenomena within the class of compound distributions.

Modified combinants, combinants and void probabilities
The dynamics of the multiparticle production process is hidden in the way in which the consecutive measured multiplicities N are connected. In the simplest case one assumes that the multiplicity N is directly influenced only by its neighboring multiplicities (N ±1) in the way dictated by the simple recurrence relation: (N + 1)P (N + 1) = g(N )P (N ), g(N ) = α + βN. (1) The most popular forms of P (N ) emerging from this recurrence relation are: the Binomial Distribution (BD) (for which α = Kp/(1 − p) and β = −α/K), the Poisson Distribution (PD) (for which α = λ and β = 0), and the Negative Binomial Distribution (NBD) (for which α = kp and β = α/k, where p denotes the probability of particle emission), Usually the first choice of P (N ) in fitting data is a single NBD [19]. However, with growing energy and number of produced secondaries the NBD increasingly deviates from data for large N (see [4]) and is replaced either by combinations of two [20,21], three [22], or multi-component NBDs [23], or by some other form of P (N ) [1,19,24,25,26]. However, such a procedure only improves the agreement at large N , whereas the ratio R = data/f it deviates dramatically from unity at small N for all fits [4,6]. This observation, when taken seriously, suggests that there is some additional information in the measured P (N ) not covered by the recurrence relation (1), which is too restrictive. In [4] we proposed a more general form of the recurrence relation, used in counting statistics when dealing with multiplication effects in point processes [27]. Contrary to Eq.
(1), it now connects all multiplicities by means of some coefficients C j , which define the corresponding P (N ) in the following way: The coefficients C j contain the memory of particle N +1 about all the N − j previously produced particles. They can be directly calculated from the experimentally measured P (N ) by reversing Eq. (5) and putting it in the form of the following recurrence formula for C j [4]: In Fig. 1 we show the results of attempts to fit both the experimentally measured (in the CMS experiment [28]) multiplicity distributions, and the corresponding modified combinants C j calculated from these data. Note that these C j show very distinct oscillatory behavior (with a period roughly equal to 16 in this case), which gradually disappears with N . It turns out that this oscillatory pattern cannot be reproduced by the C j calculated from a single NBD, we observe no trace of oscillations in this case. They begin gradually to appear for the C j calculated from 2-NBD fits (with parameters from [21]) and become clearly visible when using 3 component NBD (with parameters from [22]). However, even in this case the C j obtained from data are not yet fully reproduced. As shown in [4,5] such oscillations of C j are seen for different pseudorapidity windows, in data from all LHC experiments and at all energies. The only condition is that the statistics of the experiment must be high enough, in cases of small statistics the oscillations become too fuzzy to be recognized [6]. For the clarity of presentation we do not show errors on the figures with modified combinants C j , leaving their discussion to Appendix A (where we investigate the sensitivity of C j 's to the uncertainties of the measurements). Actually, a single NBD is not able to reproduce data because in this case the corresponding C j behave as i.e., all C j > 0 [4,5]. Quite contrary to the NBD, the modified combinants for the BD, cf. Eq. (2), oscillate rapidly, with a period equal to 2. However, their general shape lacks the distinctive fading down feature of the C j observed experimentally. This means that BD used alone cannot explain data.
The modified combinants C j defined by the recurrence relation (6) are closely related to the combinants C ⋆ j introduced long a time ago in [7,8] by means of the generating function, G(z) = ∞ N =0 P (N )z N , as (see also [1,9,10,11,12,13,14,15,16,17,18]). Namely, Therefore, the C j can also be expressed by the generating function G(z) of P (N ) as This relation will be particularly useful later for calculation of the C j from the compound multiplicity distributions defined by some generating function G(z). Note that, although the combinants, C * j , were already known for a long time, and their possible oscillatory behavior was also known, they have so far scarcely been used and were not directly extracted from the experimental data [2,10,11,12,13,14,15,16,17,18].   [21] (red dashed line) and with a 3-component NBD proposed in [22] (dotted green line, parameters the same as in [22]). (b) The corresponding modified combinants C j emerging from the CMS data (squares) compared with the same choices of NBD as used in (a). (c) Experimental smooth multiplicity distributions P (N ) displayed for low multiplicities and for energies ranging from 0.2 TeV (pp collisions at UA5 experiment [29]) up to 8 TeV (pp collisions at ALICE experiment [30]). Note the peculiar enhancement of the void probability P (0) (rather small at 0.2 TeV but quite substantial at 8 TeV). (d) The same as in (a) but limited to small multiplicities to expose the void enhancement seen in (c); it can be reproduced only by using a 2-component compound binomial distributions (CBD, composed of NB and NBD) introduced and discussed below in Sections 3 and 4.
As in the case of the combinants, C ⋆ j , the set of modified combinants, C j , provides a similar measure of fluctuations as the set of cumulant factorial moments, K q , which are very sensitive to the details of the multiplicity distribution and were frequently used in phenomenological analyses of data (cf., [1,9]), where are the factorial moments. The K q can be expressed as an infinite series of the C j , and, conversely, the C j can be expressed in terms of the K q [1,9], Modified combinants also share with cumulants the property of additivity. For a random variable composed of independent random variables, with its generating function given by the product of their generating functions, , the corresponding modified combinants are given by the sum of the independent components. On the other hand, while cumulants are best suited to the study of the densely populated region of phase space, combinants are better suited for the study of sparsely populated regions because, according to Eq. (6), calculation of C j requires only a finite number of probabilities P (N < j) (wich may be advantageous in applications).
Concerning the void probabilities, the problem also to be addressed in this work is that, as can be seen in Fig. 1 (a), in the data on P (N ) discussed in this work one observes that P (0) > P (1). As can be seen in Figs. 1 (c − d), such behavior occurs at all energies of interest. The interesting point is that it cannot be fitted either by a single NBD or by the compositions of 2 or 3 NBD used to fit the data presented in Figs. 1 (a − b). However, as shown in Fig. 1 (d), this feature of the void probability can be nicely reproduced by a 2-component compound distribution based on the BD and NBD, which we shall discuss in Sections 3 and 4. Note that the void probability, P (0), is strongly connected with the modified combinants. Using Eqs. (10) and (11) it can be written as: Using further Eq. (6) one can show that the P (0) > P (1) property is possible only when For most multiplicity distributions we also have that P (2) > P (1), which results in additional condition, which together with Eq. (18) leads to the requirement that in this case However, this initial increase of C j cannot continue for all ranks j, rather, because of the normalization condition, ∞ j=0 C j = 1, we should observe some kind of nonmonotonic behaviour of C j with rank j in this case. Therefore, all multiplicity distributions for which the modified combinants C j decrease monotonically with rank j (like, for example, the NBD, cf. Eq. (7)) do not exhibit the enhanced void probability.

Compound distributions
Because a single distribution of the NBD or BD type cannot describe data we shall check the idea of compound distributions (CD). They are applicable when the production process consists of a number M of some objects (clusters/fireballs/etc.) produced according to some distribution f (M ) (defined by a generating function F (z)), which subsequently decay independently into a number of secondaries, n i=1,...,M , following some other (always the same for all M ) distribution, g(n) (defined by a generating function G(z)). The resultant multiplicity distribution, is a compound distribution of f and g with generating function The immediate consequence of Eq. (22) is that in the case where f (M ) is a Poisson distribution (P P D from Eq. (3)) with generating function then, for any other distribution g(n) with generating function G(z), the combinants obtained from the compound distribution h(N ) = P P D ⊗ g(n) and calculated using Eq. (12), do not oscillate and are equal to In particular, in the case when g(n) is a logarithmic distribution, g(n) = −p n /[n ln(1 − p)], for which h(N ) is the NBD with k = −λ/ ln(1−p), one gets that the above C j coincide with those derived before from the recurrence relation (6) and given by Eq. (7). This reasoning can be further generalized to all more complicated compound distributions, with any distribution itself being a compound poisson distribution. This limits the set of distributions P (N ) leading to oscillating C j only to, essentially, a BD and to all compound distributions based on it.
It is interesting to note that this result also explains nicely the apparent success of the multi-NBD type of P (N ) in fitting data on the C j [5]. This happens because the sum of the NBD, with weights given by the BD and with the respectively chosen values of k and N for each component, gives exactly the same    [28] for P (N ). (d) Fits to the C j obtained from the same data using a combination of two CBD. (e) N C j for different energies (UA5 results are from [29]). (f ) The same for different rapidity windows as measured by the CMS [28] and ALICE [30] experiments. P (N ) as the compound distribution 1 . The only difference compared with the usual compound distributions is that now the void probability, P (0), is not reproduced either by the single NBD or by any combination of NBDs. As for the modified combinants it only changes their amplitude and periods of oscillations.

Results
As mentioned above, the modified combinants C j for the BD with generating function oscillate with a period of 2, whereas, as shown in Fig.  2 (a), the amplitudes of these oscillations depend on the probability emission p. To control the period of the oscillations one has to compound this BD with some other distribution. Fig. 2 (b) shows an example of using for this purpose a Poisson distribution with a generating function given by Eq. (23) (for which C 0 = 2 and C j>0 = 0). The generating function of the resulting Compound Binomial Distribution (CBD) is Fig. 2 (b) shows the C j obtained from such a CBD with K = 3 and λ = 10 and calculated for three different values of p in the BD: p = 0.54, 0.62, 0.66. Note that, in general, the period of oscillation is now equal to 2λ (i.e., here, where λ = 10, it is equal to 20). However, such a CBD lacks the fading down feature of its C j and therefore cannot fit the results presented here. As shown in [5] the situation improves substantially when one uses a multi-CBD based on Eq. (26), but even then the agreement is not satisfactory. The situation improves dramatically if one replaces the Poisson distribution by a NBD and, additionally, allows a two-component version of such a CBD in order to gain better control over both the period and the amplitudes and on their behavior as a function of the rank j, and 1 This is because the sum of M variables, each from the NBD characterized by parameters (p, k), is described by a NBD characterized by (p, M k). In the case where M = 1, . . . , K is distributed according to a BD, we have a K-component NBD (where consecutive NBD have precisely defined parameters k), P (N ) = K M=0 P NBD (M )P NBD (N ; p, M k), which naturally leads to the appearance of oscillations.

The generating function of such a CBD is
As one can see in Fig. 2 (c − d) in this case (with K 1 = K 2 = 3, p 1 = 0.7, p 2 = 0.67, k 1 = 4, k 2 = 2.3, m 1 = 6, m 2 = 19.0 and w 1 = w 2 = 0.5) one can nicely fit both the P (N ) and C j . Of special importance is the fact that the enhancement P (0) > P (1) is also reproduced in this approach. This is best visible in Fig. 1(d) which concentrates on the region of small N only. This result is presented there in comparison with the results of a number of other, seemingly very good, fits based only on some combinations of NBD (and not using the BD), which are not able to reproduce this feature of the data. Summarizing this part: it turns out that to describe all aspects of the data on multiparticle distributions one has to use a multicomponent compound distribution based on the BD (which is responsible for the oscillations in C j ) which is compounded with some other distribution providing damping of the oscillations for large N (in this example it is a NBD). In Fig. 2 (e) we show the results of such approach applied to data taken at different energies (in rapidity window |η| < 3). Fig. 2 (f ) shows another intriguing property of the modified combinants, namely that both their periods and amplitudes increase with the width of the rapidity window in which the data on the resulting P (N ) were collected. A more detailed picture of this phenomenon is presented in Fig. 3 showing the description of the multiplicity distributions (NSD events at 7 TeV) measured by ALICE [30] for three different rapidity windows: |η| < 2, |η| < 3 and −3.4 < η < 5. The most intriguing feature observed is the rather dramatic increase of both the period of the oscillations and their amplitude with the width of the rapidity window used to collect the data and, most noticeably, the previously observed fading down of their amplitude is now replaced by an (almost) constant behavior (for |η| < 3) or by a rather dramatic increase (for −3.4 < η < 5). Because N ∼ ∆η, some part of this increase could be caused by the increase of N with ∆η, the rest expects an explanation. However, in general, with increasing ∆η both probabilities, p and p ′ , are increasing which results in an increase in the amplitudes of the C j with rank j.
So far both components are based on the same BD with K 1,2 = 3. We have checked that one can safely increase the parameter K 2 in the second component while keeping practically all the parameters of the first component the same and, for appropriately selected other parameters of the second component, we get results essentially indistinguishable from those presented [30]. (d)-(f ) The corresponding modified combinants C j emerging from them fitted using a two compound distribution (BD+NBD) given by Eqs. (28) and (27) with parameters listed in Table 1. Table 1 Parameters w i , p i , K i , k i and m i of the 2-component P (N ), Eqs. (28) and (27), used to fit the data in Fig. 3. For completeness the corresponding p ′ i = m i /(m 1 + k i ) from Eq. (28) are also included.  Fig. 3 (for example, for the ALICE data at 7 TeV and |η| < 2, this can be done for K 2 = 3, 6, 9, 12, 20 and (p 2 , m 2 ) such that K 2 p 2 m 2 = const ≃ 32, which means that for the second component N BD N N BD ≃ 32). So far we cannot offer any convincing explanation of our findings. At the moment the rough idea could be, for example, that the two components correspond to a quark-quark interaction (therefore K = 3) and to a gluon-gluon interaction (in this case K could be different, as mentioned above) 2 .

Summary and conclusions
We presented an approach in which one can simultaneously reproduce such features of the observed mul-2 To be more specific, one can try to estimate the number of a "hard" gluons participating in the interaction (which is equivalent to K 2 ). For example, in [31] it was estimated as N G = 2.84 ln( √ s) − 11.45, which for an energy of 7 TeV gives N ∼ 14.
tiplicity distributions as: their shape as a function of the multiplicity, P (N ), the peculiar properties of the observed void probabilities, P (0) > P (1), and, finally, the behavior of the modified combinants, C j , which can be deduced from the measured P (N ). In particular, we have shown that the most popular type of multiplicity distribution, the NBD, cannot alone describe the data on P (N ). The 2-component NBD can describe the data on P (N ) but fails to describe the observed oscillatory pattern of the C j obtained from them. These two features can be fully reproduced by multicomponent NBD models (like, for example, the 3-NBD proposed in [22]), but such models do not reproduce the property that P (0) > P (1). In fact, none of this class of models reproduces it (as long as it does not take into account also contributions from the singlediffractive and double-diffractive events [33]). On the other hand, the compound distributions discussed in [2,3], which were specially designed and tuned to describe the P (0) > P (1) property, do not reproduce the oscillatory behavior of the corresponding modified combinants, C j . This is because they belong to the group of infinitely divisible distributions (for which C j are positive for all ranks j [9]). Our approach is based on the compound distribution, CBD, with the main role played by the BD. It is responsible for the oscillatory behavior of the modified combinants, C j , and is compounded with a NBD which is responsible for the amplitudes and periods of these oscillations (depending on the experimental circumstances, they may be declining or growing with different periods). The lack of oscillatory behavior of the C j deduced from the NBD can then be attributed to the fact that the NBD is itself a compound distribution of the Poisson and logarithmic distributions, and compound distribution of a Poisson with any other distribution always results in non-oscillating (in fact, exponentially fading down) C j . On the other hand, the emergence of the oscillatory behavior of the multi-NBD can be attributed to the fact that a sum of NBDs is, under some conditions, equivalent to a compound distribution of a BD with a NBD.
In summary, we believe that the modified combinants, C j , deduced from the measured multiplicity distributions, P (N ), together with the already measured void probabilities, could provide additional information on the dynamics of the particle production. This, in turn, could allow us to reduce the number of possible interpretations presented so far and, perhaps, answer some of the many still open fundamental questions. Experimental measurements of C j (or, rather, presenting them together with the already measured P (N )), appear in this context as a new important necessity.
A detailed discussion of the sensitivity of modified combinants C j to the uncertainties of measurement is given in [32]. Here we present only some remarks on the estimation of errors of C j based on the published data and on the Monte Carlo simulations.
To summarize, as shown in Fig. 4, one observes that statistical errors cause only some chaotic spread of the measured C j but do not result in periodic oscillations. In the case of the monotonic behavior of C j as function of the rank j (for example, when one uses NBD) one gets no oscillations from errors. However, in the case when one observes oscillations, systematic errors can actually blur the whole picture of oscillation (making them invisible). The most important point is that oscillations of C j are highly correlated, cf. Eq. (A.1) (they are not chaotically scattered). Statistical errors do not give such oscillations. Fig. 4(a) shows values of the C j for a NBD. Note that for large statistic (comparable to that in ALICE for 7 TeV where we have 3.437 × 10 8 selected events [30]) we obtain very smooth dependence on the rank j. On the other hand, making analytical estimates of the size of the error one gets that Note that the last term of Eq. (A.1) introduces dependence of the error in C j to errors of the all previous coefficients C i<j . This results in the error cumulation effect leading to a significant increase of errors with increasing rank j, as can be observed in Fig. 4 We end with estimation of the statistical significance of the oscillating behaviour of modified combinants C j . This can be done using the periodogram-based Fisher g-statistic test [34,35]. This test determines whether a peak in the periodogram is significant or not and it proceeds as follows. Given a series y(j) = N C j of length L, the periodogram I(ω) is first computed as It is evaluated at the discrete normalized frequencies  Errors of N C j evaluated using the systematic and statistical uncertainties of P (N ) given by ALICE [30]. (c) Monte Carlo evaluated coefficients N C j emerging from the systematic and statistical errors of P (N ). The curve presented here denotes the fit to the original coefficients C j obtained from the measured P (N ), it is not the fit to the points shown. (d) For the same data as before the errors were evaluated assuming only statistical uncertainties of the measured P (N ) with a poissonian distribution of events in each bin, i.e., V ar[P (N )] = P (N )/N stat . Note that in this case statistical errors do not give any noticeable errors of C j . (e) Monte Carlo evaluated coefficients N C j with only statistical errors of P (N ) accounted for. The continuous curve represents the fit to the original coefficients C j obtained from the measured P (N ). (f ) The modified combinants C j emerging from the ALICE data on P (N ) [30]  where a = [(L − 1)/2] and [x] denotes the integer part of x. If a series has a significant sinusoidal component with frequency ω k , then the periodogram will exhibit a peak at that frequency. Fisher derived an exact test of the significance of the spectral peak by introducing the In Fisher's test, one is testing the null hypothesis, H 0 , that the spectral peak is statistically insignificant against the alternative hypothesis, H 1 , that there is a periodic component in the signal y(j). Under the Gaussian noise assumption, the exact distribution of the g-statistic under the null hypothesis H 0 is given by In in Fig. 5) we present normalized periodogram I(ω) for N C j calculated from ALICE data discussed here [30]. Note large observed value of the g, a peak in the periodogram, which indicates existence of a strong periodic component and leads us to reject the null hypothesis. The probability that the spectral peak is statistically insignificant is 10 −16 .Therefore, similarly as in [32], we can conclude that notwithstanding the large sensitivity of oscillations of the modified combinants to systematic uncertainties of the measurements of P (N ), they show enough power to disclose the fine details of experimentally measured multiplicity distributions, and can shed a new light on the dynamics of multiparticle production processes.