AMS-02 antiprotons are consistent with a secondary astrophysical origin

Mathieu Boudaud, ∗ Yoann Génolini, † Laurent Derome, Julien Lavalle, David Maurin, Pierre Salati, and Pasquale D. Serpico LPTHE, Sorbonne Université, CNRS, 4 Place Jussieu, F-75005 Paris, France Service de Physique Théorique, Université Libre de Bruxelles, Boulevard du Triomphe, CP225, 1050 Brussels, Belgium LPSC, Univ. Grenoble Alpes, CNRS, 53 avenue des Martyrs, F-38000 Grenoble, France LUPM, Université de Montpellier, CNRS, Place Eugène Bataillon, F-34000 Montpellier, France LAPTh, Univ. Grenoble Alpes, USMB, CNRS, F-74000 Annecy, France

Introduction -The spectrum of Galactic cosmic-ray (CR) antiprotons (p's) originating from interactions of CR protons on the interstellar (IS) gas was first calculated fifty years ago [1,2]. CRp's were discovered a few years later in balloon-borne experiments [3,4], but in excess compared to this expected secondary background; this was soon interpreted as the presence of an extra contribution, possibly from primordial black holes [5] or dark matter (DM) annihilation [6,7]. Over the past thirty years, a lot of progress has been made both theoretically and experimentally.
In the following decade, the next generation of instruments-BESS-TeV/Polar [25][26][27] and the PAMELA satellite [28][29][30]-measured several thousands ofp's up to a few hundreds of GeV. Updates of the model parameters and corresponding predictions, using better measured p and He CR spectra, yielded a secondary flux in agreement with the data [e.g. 31].
The state-of-the-art AMS-02 experiment has been operating on the ISS since 2011. It has already recorded several tens of thousands ofp's [32], up to TeV energies. For the first time, the measurement is dominated by systematic uncertainties (at a few percent level), implying new challenges for their interpretation. Here we address this issue, going beyond previous analyses of preliminary AMS-02p's data [33][34][35]. * boudaud@lpthe.jussieu.fr † yoann.genolini@ulb.ac.be We underline that CRp's are one of the most sensitive astroparticle probes of annihilating/decaying DM in the GeV-TeV range [36][37][38][39][40], and any constraint on DM candidates depends on how well the astrophysical background is controlled. This is especially important as claims forp excesses attributed to DM are being debated [41][42][43][44][45].
Methodology -The flux of CRp's at Earth depends on (i) the cross sections entering their production, scattering, and annihilation, (ii) the CR propagation model, (iii) the IS spectrum of the (most abundant) CR nuclei, and (iv) modulation of fluxes in the Solar cavity.
Uncertainties on relevant nuclear cross sections are among the dominant ones for thep flux calculation [22,33]. Recently, new data have been taken [46][47][48], leading to improvements in the cross section parametrisations [49][50][51][52][53]. A crucial ingredient needed for the calculation is the Lorentz invariant cross section of prompt p's produced in pp interactions. We make use of the parametrisation proposed in Ref. [52] and improved in Ref. [53] (Param II). For nucleon-nucleon interactions, we use the scaling relation B proposed in Ref. [53] which provides the best agreement with LHCb data. We use the covariance matrices of errors on the parameters to propagate the uncertainties to thep flux calculation. Antineutrons (n's) and antihyperons are produced in hadronic interactions and decay intop's; their contribution has to be included in the total cross section. The yield ofp's produced vian's is larger than in the promptp channel, an energy-dependent effect studied and parameterised in Ref. [52], in agreement with the very scarce experimental data. We introduce an exponentially decreasing energydependent parametrisation in order to reproduce the 1σ CL interval of Fig. 8 in Ref. [52]. Regarding the antihyperon contribution, we make use of the parametrisation determined in Ref. [52] and the covariance matrix of the corresponding parameters to propagate the uncertainties on thep flux calculation. More details on the nonprompt channels and on the way we propagate the related uncertainties are given in the App. A 1 and App. A 2, respectively. Inelastic annihilating and nonannihilating interactions ofp's with the IS medium (ISM) are treated following the procedure described in [38].
We use the 1D diffusion model implemented in usine v3.5 [54] which assumes a thick diffusion halo size L, and a thin disc containing the sources and the gas. Computationally more expensive scenarios with extended geometries [50,55,56] and time dependence [57] have been considered in the literature. Yet, should simple 1D models prove sufficient to account for the data in a secondary production scenario, more complex scenarios would a fortiori work as well. We use the most generic transport model defined in [58] (called BIG) where the transport parameters are fitted on B/C following the methodology described in [59], i.e. a model of the covariance matrix of B/C AMS-02 errors has also been incorporated in the fit. Using the minuit [60] package to fit the B/C, we compute the best-fit values of the free parameters (transport and nuisance parameters) and their covariance matrix. We fully propagate the transport uncertainty to thep flux from the covariance matrix.
Important inputs for thep calculation are the IS fluxes for all progenitors (mostly H, He, C and O, but also all nuclei up to Fe, see App. B 3 for their detailed contributions; we also account for heavy elements in the ISM, see App. B 4). There are two options to propagate the associated uncertainties: a standard approach in semianalytical models [22,31,33,52] is to demodulate topof-atmosphere (TOA) data, in order to extract the parameters describing the IS fluxes and their uncertainties, which allows to yield thep flux with a fast procedure and to easily propagate the progenitors uncertainties. The preferred approach in fully numerical propagation codes is to directly use the IS fluxes calculated at the propagation stage, ensuring that the calculated fluxes fit the data [23,35,55,61]. In this global fitting, the transport and source uncertainties are determined simultaneously. Although using a semianalytical model, we follow this second approach but do not rely on a global fitting of the transport and source parameters. Instead, we follow and extend the two-step procedure detailed in [58] (see Sec. III therein): we start from the best-fit parameters obtained from the B/C analysis (BIG) and then perform a simultaneous fit of H [62], He, C, and O [63] AMS-02 data to determine the source parameters, i.e. four normalisations ( 1 H, 4 He, 12 C, and 16 O) and three slopes (α H , α He , and a universal source slope α Z>2 for all other species). Source isotopic fractions are fixed to Solar System values [64]; the abundances of non-fitted elements up to Z = 30 are fixed such as to match the HEAO-3 data [65] at 10.6 GeV/n. We fit at the same time the high-rigidity diffusion break parameters which are better constrained by elemental fluxes than by the B/C [58]. We include in the fit the covariance matrices of errors on H, He, C, and O data, see App. B 1. The outputs of this fitting procedure are discussed in App. B 2. Note that we fix the Fisk potential 1 φ FF for H, He, C, and O data to the value yielded by the B/C data fit [58]. While published AMS-02 data are from the same time period for B/C, He, C, and O (05/2011 to 05/2016), they originate from a shorter period for H (05/2011 to 11/2013), so that the associated modulation level should be slightly different. However, as the fit is restricted to data above thē p production threshold (E k/n > 6 m p [22]), the impact is feeble, and in any case negligible compared to other uncertainties.
State-of-the-art modulation models are 2D or 3D charge-dependent diffusion and drift in the Solar cavity [71]. They go beyond the force-field approximation [72] and at variance with the latter, they predict different modulations for positively-and negatively-charged particles. However, these models have a large number of parameters and are still under study. Moreover, the impact of the drift effect is strongly dependent on the data-taking period. The AMS-02p data [32] were analysed from a time period of 4 years (May 2011 to May 2015) longer than that of H and shorter than that of He, C, and O data. As illustrated in [73], drift models can be used to derive effective and different Fisk potentials for nuclei andp's, with φp FF which is not clearly smaller or larger than φ nuc FF on a sufficiently long period. Assuming the same modulation forp and H, He, C, and O actually already gives a satisfactory description at low rigidity. For this reason, we did not include further uncertainty on φp FF . If anything, this would slightly enlarge the uncertainty of the prediction at low rigidity, improving further the consistency between our calculatedp flux and the data.
Uncertainties in the parameters discussed above should be propagated to thep flux calculation. To proceed, for each source of uncertainty (production cross section, transport in the Galaxy, fit to parents fluxes) we draw ∼ 10000 realizations of the parameters from their covariance matrix and we compute the correspondingp fluxes. We then calculate the 1σ confidence-level (CL) envelope for each source of uncertainty as well as the covariance matrix of the model C model (see App. C). This enables us to soundly assess the compatibility of the model with the data.
In the context of the B/C analysis, we stressed the importance of using a realistic covariance matrix of the data errors to avoid misleading conclusions [59]. We anticipate that the same is true forp's. However, since this matrix is not directly provided by the AMS-02 collaboration, we build it from the published systematic errors and associated description of their physics origin, in the same spirit as in [59].
The various contributions to the AMS-02 systematics are broadly described in [32]. For instance, the text   [74]. Individual contributions in the systematic errors namely rigidity cut-off (Cut-off), selection (Sel.), template fitting (Templ.), cross sections (XS), unfolding (Unf.), rigidity scale (Scale) and acceptance (Acc.), built from information provided in [32] are shown (coloured lines) before (thin) and after (thick) the rescaling applied to match the total systematic error.
"This [selection] uncertainty amounts to 4% at 1GV, 0.5% at 10GV, and rises to 6% at 450 GV" is interpreted as a piecewise power-law behaviour, and is shown as an orange thin line in Fig. 1. Thus we build the seven sources of systematics quoted by the AMS-02 collaboration (coloured thin lines), where the quadratic sum of all contributions leads to the black thin line. In order for the sum to match the total systematic errors provided in [32] (black thick line), we rescale for each rigidity point our separate contributions by the ratio of the thick to the thin black lines. This leads to our model for the AMS-02 p systematics (coloured thick lines). The covariance matrix associated with these systematics is then built based on a choice of their correlation length, . More details on this procedure are given in App. C and, for the B/C analysis, in [59]. Forp's, we take as educated guesses for the correlations lengths (in unit of energy decade) Acc. = 0.1 (acceptance), Cut = 1.0 (rigidity cut-off), Scale = 4.0 (rigidity scale), Templ. = 0.5 (template fitting), XS = 1.0 (cross sections), Unf. = 1.0 (unfolding), and Sel. = 0.5 (selection). Fig. 2 shows our baselinep flux prediction (not a fit) obtained from the bestfit values for thep production cross sections, the transport (BIG) and the associated parents fluxes, compared with AMS-02 data with errors taken as the quadratic sum of systematic and statistical errors (black crosses). The 'standard' residuals with respect to the baseline model are displayed in the middle panel. Note that the points do not include the model uncertainties, nor correlations in the data uncertainties. We also show on the same plot the 68% total confidence band for the model (grey band). This band could release the tension with the data, even before accounting for the information on the correlations in rigidity bins. The respective contributions of parents, cross sections and transport are also plotted. At tens of GV, the errors from transport and cross sections are almost constant and close to 10%. At larger rigidities, the errors from transport and parents increase because of the increasing experimental uncertainty in the B/C ratio and parent fluxes, respectively. At low rigidity, the error from transport grows for the same reasons and encompasses the uncertainty in the prediction of the low-rigidity behaviour (see App. E). However, we remind the reader that a visual comparison can be deceiving: the presence of non-diagonal values in the covariance matrices is responsible for a better agreement between the model and data than perceived in the residuals (see Table I).

Results -The top panel of
To test the actual compatibility of our prediction with thep data, we present two statistical tests which boil down to probabilistic statements in terms of p-value (see App. D). First, we propose a χ 2 test, with the help of a covariance matrix of errors on both data and model: The covariance matrices of the data C data and the model C model are given by the sum of the different contributions previously detailed (see App. B 1 and App. C). We find χ 2 ≈ 44, and identifying the number of degrees of freedom (dof) with the number ofp data points (57), we infer a corresponding p-value of 0.9 which is reported in the last line of Table. I. Such a test does not directly assess a possible overestimate of the errors, and also relies on the notion of number of dof (which may be a shaky concept in some circumstances, see e.g. the discussion in [75]). Thus, we also perform a Kolmogorov-Smirnov (KS) test, which obviates the above limitations. We compute the distribution of the 'eigen residuals' (z-score) corresponding to the residuals of the eigen vectors (data-model) of the total covariance matrix. In the bottom panel of Fig. 2 we show these 'eigen residuals' as a function of 'rigidity' (actually, the one rotated in the eigen basis, see App. D), and in the inset, we compare the corresponding histogram with a Gaussian. The KS test leads to a p-value of 0.27 which is also very good and is clearly consistent with the hypothesis thatp's are of secondary origin. For completeness, we also report in Table. I the p-values when considering different combinations of errors: (i) If there was no uncertainty in our baseline model, the covariance matrix of data errors alone (C data ) would already give enough freedom to allow for a very good agreement between the data and the secondary flux prediction; (ii) Considering only the statistical uncertainties in the data and the uncertainties in the model (σ stat and C model ), this prediction is marginally consistent with the data at the 2σ level, with the KS test leading to an even better p-value. Also note the relevance of the KS test (as opposed to the χ 2 test) to spot error overestimates, in the case of σ tot and C model ; (iii) In the most realistic case considering both C data and C model , p-values are acceptable for both the χ 2 and KS test. Thus, not only is a secondary origin for the locally measuredp's statistically consistent with the data, but, as shown by these considerations, it is also robust with respect to error mismodelling in either model or data errors. Conclusions -Percent-level details in the model predictions now matter, as do more subtle aspects of the data error treatment. In this Letter we have presented a major upgrade of thep flux prediction and analysis by: (i) using the latest constraints on transport parameters from AMS-02 B/C data, (ii) propagating all uncertainties (with their correlations) on the predictedp flux, (iii) accounting for correlated errors inp data. With these novelties, we unambiguously show that the AMS-02 data are consistent with a pure secondary astrophysical origin. We stress that this conclusion is not based on a fit to the AMS-02p data, but on a prediction of thep flux computed from external data. Our results should hold for any steady-stade propagation model of similar complexity, as they all amount to the same "effective grammage" crossed to produce boron nuclei (on which the analysis is calibrated), with roughly the same grammage entering the secondaryp's. More elaborate models would be less constrained and thus would make the agreement even better.
On the technical aspects, more computationally expensive methods could allow one to go beyond the quadratic assumption (i.e. assuming multi-Gaussian error distributions) embedded in the covariance matrix of errors. For more advanced applications, sampling techniques like Markov chain Monte Carlo could be used (e.g., [76]). However, a significant improvement in our perspectives for DM searches in thep flux can only be achieved by simultaneously reducing the systematics in the data and the errors of the modelling. On the data side, a covariance matrix of errors directly provided by the AMS-02 collaboration would definitively be an important improvement to fully benefit from the precision achieved by AMS-02. On the modelling side, the next step would be to combine more secondary-to-primary ratios (Li/C, Be/C, and B/C) to further decrease the propagation uncertainties. Of course, better data and modelling onp andn production cross sections is also required, and the sub-leading error due to primary source parameters could be reduced by combining AMS-02 data with higher energy data from CREAM, TRACER and CALET [77].
Acknowledgements -MB is grateful to Michael Korsmeier and Martin Winkler for very useful discussions. We are grateful to all the members of the Cosmic Rays Alpine Collaboration. This work has been supported by the "Investissements d'avenir, Labex ENIGMASS", by Univ. de Savoie, appelà projets: Diffusion from Galactic High-Energy Sources to the Earth (DIGHESE). The work of Y.G. is supported by the IISN, the FNRS-FRS and a ULB ARC. We also acknowledge a partial support from the Agence Na- Most of thep's are produced via decays of antihyperons andn's. These contributions are not included in the promptp cross section measured by the collider experiments used in Ref. [53]. Ref [78] proposed energydependent parametrisations based on the most up-todate collider data. The totalp production cross section is where σ inv is the Lorentz invariant cross section of promptly producedp's, ∆ IS and ∆ Λ the isospin asymmetry and anti-hyperon corrections, respectively.

Antihyperons
A sizeable fraction ofp's is produced via the decay of antihyperonsΛ andΣ. The ratio of hyperon-induced to promptly producedp's is ∆ Λ = (0.81±0.04)(Λ/p), wherē Λ/p is the ratio ofp's produced via the decay ofΛ to the total yield. Here, it is assumed that antihyperons decay equally intop's andn's. The energy dependence ofΛ/p is given byΛ where √ s is the centre-of-mass energy, c 1 = 0.31, c 2 = 0.30, c 3 = (146 GeV) 2 , and c 4 = 0.9. We determine the uncertainty by randomly drawing 10000 realisations of the parameters from the covariance matrix. This method enables us to reproduce the median and the 1σ CL obtained in Ref. [78]. They are shown in the top panel of Fig. 3.

Antineutrons
Antiprotons are preferentially produced via the decay ofn's. The energy dependence of the isospin asymmetry proposed in Ref. [78] is motivated by the fact that a pion has to be produced whenp's are promptly produced. We found that taking the parametrisation and the covariance matrix of parameters from Ref. [78] does not allow to reproduce the median and the 1σ CL interval the authors show in Fig. 8. Instead, we introduce the parametrisation where x = log( √ s), c 0 = 0.33, c 1 = 6.42, c 2 = 0.5 and c 3 = 1.6. We randomly draw 10000 realisations of c 0 using a Gaussian distribution with a standard deviation of σ c0 = 0.36 and with the lower bound c 0 > 0.04. This way, we recover the median and the 1σ CL interval of Fig. 8 in Ref. [78]. The results are shown in the middle panel of Fig. 3. Since we adopt the nucleon-nucleon scaling relation determined in Ref. [53], which depends (though slightly) on a different parametrisation of ∆ IS , we rescale the parameter D 2 entering Eq. (17) of Ref. [53] in order to obtain the same value of f A1A2 at the energy of NA49. The ratio between σ tot inv as expressed in Eq. (A1) and σ inv is shown in the bottom panel of Fig. 3. In the case of vanishing nonprompt contribution and isospin violation, one would expect a constant value equal to 2. Hence, not only thep flux receives an upwards correction of ∼ 20% to 50%, but also acquires peculiar energy-dependent features, further affected by relatively large uncertainties.
Appendix B: Antiproton parents

AMS-02 data errors
In H, He, C, and O fluxes published by the AMS-02 collaboration [62,63], uncertainties are broken-down into several contributions: statistical (Stat.), unfolding (Unf.), rigidity scale (Scale), acceptance (Acc.), and even trigger (Trig.) for protons data. As discussed in [59], the information provided in the Supplemental Material of AMS-02 publications allows us to build a first estimate of the covariance matrix of uncertainties C α . For each uncertainty α (Stat., Unf., Scale, Acc., and Trig.), the ij-th element of the matrix is taken to be where σ α i is the systematic uncertainty of type α taken at rigidity bin R i , and α is the correlation length indicating the range over which rigidity bins are correlated.
As the data analysis of AMS-02 nuclei follows many similar steps, the correlation lengths derived in [59] for B/C data [74] should not be too different from the ones for the proton [62] and He, C, and O data [63]. Consistently with [59], we take: • Stat. = 0 (statistical uncertainties are uncorrelated); • Scale = ∞ (rigidity scale impacts all rigidities);  We also show the ratio between total production cross section and promptly producedp cross section (bottom panel).
Note that Acc. norm is not well constrained, while it was found to be critical for the B/C analysis [59]. Its value relies on indirect consistency arguments (analysis of transport parameters), and we refer the reader to [59] for the discussion and caveats about this choice. The above numbers and various systematics (shown in Fig. 4) are plugged in Eq. (B1) to generate the covariance matrices for H, He, C, and O. These matrices are used for the determination of source parameters.

Primary CR source parameters from the AMS-02 H, He, C, and O data
As discussed in App. C, for each propagation configuration (BIG, SLIM, and QUAINT, see [58] and also Sec. E), we fit the source parameters and the high-rigidity break of the diffusion coefficient on H, He, C, and O AMS-02 data, relying on their covariance matrix of errors introduced in Sec. B 1. As we are only interested in the region where these species producep's, we limit the fit to data above the threshold production E k /n = 6m p . As the H, He, C, and O nuclei are the dominantp progenitors (see Sec. B 3), we check here that good fits are achieved in order to ensure the reliability of ourp prediction. Table II gathers the outputs of the fits, i.e. the χ 2 /dof values, best-fit values for the sources parameters (4 normalisations and 3 slopes) and high-rigidity diffusion break parameters (break slope, position, and sharpness). First, the values obtained here for the diffusion break parameters and source slope α Z>2 are consistent with those of the analysis carried out in [58] based on the C and O nuclei only (and only using total errors instead of the covariance matrix of errors in the fit). The additional information brought from this analysis is a comparison of the H, He, and Z > 2 source spectral indices 2 . While this is not the goal of this study, it is worth mentioning that all models prefer a slightly harder spectral index for He than for heavier species, ∼ 0.02 harder (uncertainties, not shown in the  [62,63], namely those coming from the statistical treatment, the acceptance, the absolute energy scale, and the unfolding procedure (and also from the trigger for protons). The black lines correspond to a further splitting in the acceptance errors-normalisation (norm.), low energy (LE), and residual (res.)-as discussed in the text. The step-like evolution is due to rounding errors from the published tables. primary elemental data 3 with χ 2 /dof = 1.93. Figure 5 shows further details with a graphical view of the goodness-of-fit for the H (blue), He (red), C (green), and O (orange) nuclei data in and outside the fit regions (grey shaded area on the left and centre panels). The left panels show z-scores, i.e. the residuals measured in unit of σ tot , (model-data)/σ tot , for the BIG (top), the SLIM (middle), and the QUAINT (bottom) propagation model configurations: a good fit implies that 68% of the points should be distributed within 1σ tot . However, this representation does not account for correlations in the systematic errors. As discussed in Sec. D, it is more interesting to show the same quantity in a rotated basis in which the covariance matrix of errors is diagonal (calledz-scores) 4 . Thez-scores shown in the second column of Fig. 5 no longer display the spectral features seen in the left panels. This illustrates the importance of properly visualising scores with the covariance accounted for, 3 The bad fit of H data in QUAINT is understood as this model has high levels of reacceleration and convection, which affects more strongly A/Z = 1 species (i.e. H) than A/Z ≈ 2 species (He, C, O). 4 The price to pay is that now the matrix of rigidity positions is no longer diagonal, and the value we pick to represent in the middle panels of Fig. 5 is the rotated z-score (dubbedz-score) defined in Sec. D.
as we may otherwise be visually biased by these structures. The right panels of Fig. 5 are the histograms of thez-score points displayed in the middle panels. They indicate that whenever a good χ 2 /dof value is obtained, the normalised residuals follow a 1σ-width Gaussian distribution (solid black line). These plots provide further details on the separated contributions from H, He, C, and O data: in all cases, the H data are always less well fitted (broader distribution), especially in the QUAINT model configuration, which explains the bad χ 2 /dof.
In the light of this study, we correctly reproduce the H, He, C, and O AMS-02 data (except for the QUAINT model configuration) which gives good confidence in the self-consistency of thep flux calculation.

Ranking parent contributions
From the BIG model configuration, we show in Fig. 6 the fractional ISp contribution per parent element, properly accounting for each CR isotopes (and summed over all ISM components, here H and He). We wish to highlight a few features in this plot: the most significant channels are those involving the most abundant CR fluxes, i.e. H (∼ 80%), He (∼ 20%), then O, C, Fe, Si, and Mg (∼ 1 − 3%), followed by Ne and N (∼ 0.5%). All the remaining ones give sub-percent contributions, but we stress that all the contributions below 0.2% (grey shaded area) add up to a total of ∼ 1% (solid black line). Also, if we only account the contributions of elements heavier than Si, they add up to a total of ∼ 2% (solid red line).
The contributions of the two most important elements (H and He) are broken down into their isotopic content (blue and green solid lines for H and He isotopes respectively), highlighting the contributions from primary species ( 1 H and 4 He) w.r.t. sub-dominant secondary ones ( 2 H and 3 He, lighter-coloured lines).
Focusing on the rigidity dependence, the feature at a few GV is related to secondary species: the peaks/dips seen at these rigidities directly reflect the peaks in secondary-to-primary ratios (e.g., B/C, 3 He/ 4 He, etc.). Also the spectra of secondary species are one δ (slope of the diffusion coefficient) steeper than those of primary species, so that their contributions fall off with rigidity. Among the primary species, a difference is observed between H and all heavier elements: H has a steeper source spectrum than the other species (see Table II), so that its relative contribution w.r.t. these elements decreases with rigidity.
For the sake of illustration, we show in Fig. 7 that these fractional contributions are not very sensitive to the transport configuration considered (BIG, SLIM, and QUAINT setups, see Sec. E). This is not surprising as all three are tuned to match the B/C, H, He, C, and O data. The main difference is observed at high rigidity in the QUAINT configuration, because in this model, contrary to the BIG and SLIM ones, H and other species have a similar source spectral indices (see Table II in Sec. B 2).

Impact of heavy species in the ISM
To be able to perform tens of thousands fits of the H, He, C, and O data together with the associatedp flux predictions, almost all our calculations were carried out restricting ourselves to CR parents up to 30 Si. Moreover, a standard practice in the field is to assume that the IS gas is only made of H and He. However, in the context of high-precisionp data, it is important to reassess the impact of heavier species in both CRs and in the ISM, to control the accuracy of thep flux calculation. For instance, the fraction of CNO in the ISM is estimated to be ∼ 0.5 − 1% [21,53,80], whereas that of heavy species in CRs (e.g. Fe) is estimated to be 1% [53] (see Fig. 6).
Before coming to our prescription to account for these contributions, a few remarks are in order, since 'heavy' CRs and ISM enter the calculation at several steps: to extract the transport parameters from the B/C analysis, in the calculation and fit of H, He, C, and O fluxes, and then in thep flux calculation. Our reference calculation here is based on the transport configuration BIG (see Sec. 11 of [58] for more details), and the best-fit transport parameters proposed in [58] only consider CR parents up to 30 Si with H and He only in the ISM. Hence, for a self-consistent calculation, we must refit the B/C (using the same procedure as in [58]), to extract the transport parameters for the same CR and ISM content as for thē p propagation. The last missing piece is the detail of the implementation of the ISM composition and cross sections for our calculations. The IS gas composition and species abundances are taken up to Fe from [81], the production cross sections forp's were already discussed in the main text and in Sec. A, and for the nuclear inelastic and production cross sections, we rely on a A 0.31 target scaling relation proposed in [82] for 3 He production.
We show in Fig. 8 the ratio of a given TOA calculation to our referencep flux. To better understand each effect, we show separately the impact from the 'propagation stage' and from the fully consistent calculation: in the former case, we keep the same CRs and ISM as in the reference for thep flux calculation, but use the best-fit transport parameters with 'heavy' CRs and/or ISM species, whereas for the latter case, 'heavy' CRs and ISM species are self-consistently used at all stages of the calculation. shape. Intuitively, these changes go in the right direction, as opening more channels for boron production means less grammage to get the same B/C data, and thus lessp's. The fine details are difficult to track down as they involve a mixture of increased production (CRs and ISM species) balanced by an increased destruction on the heavier ISM species (more efficient both at lower rigidities and for heavier species).
• 'Fully consistent' calculation (thick solid lines): accounting for heavy CRs and heavy ISM species in the transport parameters and in thep flux calculation leads to an overall impact of ∼ 1 − 2%, i.e. smaller than in the 'propagation stage only' case discussed above. This is not so surprising since as we open more channels for boron production, we also open more channels forp production, the two effects mostly cancelling out 6 . Finally, accounting FIG. 6. Relative contributing fraction fZ (in percent) ofp (thick dash-dotted lines) for propagation model BIG used in this study [58]. Specific contributions highlighted in thin solid lines: (i) broken-down isotopic contribution of 1 H and 2 H (blue), and 3 He and 4 He (green); (ii) contribution from all elements Z > 14 (red); (iii) contribution from all elements whose fraction is fZ < 0.2% (black). The grey area delimits the region within which contributions are below 0.2%.
If heavy species are not consistently taken into account, they may lead to spectral features at the few percent level.
3. Because we want to have a calculation as accurate as possible, all ourp fluxes calculated from 'CRs up to 30 Si' and 'ISM = H, He' are rescaled by the solid black curve of Fig. 8; the analysis presented in the main paper thus accounts for the contribution of all relevant heavy species.  7. Same as Fig. 6 for contributions from BIG, SLIM, and QUAINT (thinner to thicker lines) transport configurations for fZ > 0.2% only. See text for discussion.

Appendix C: Correlations of data and model uncertainties
To compute the likelihood of the AMS-02p flux, we first assume the data to follow a Gaussian distribution, with covariance matrix C data , around the theoretical prediction. The various sources of experimental errors, each contributing to C data , are discussed in the main text.
Here, for each of these, we plot in Fig. 9 the corresponding correlation matrix defined as where i and j stand for rigidity and C is the covariance matrix. We notice that c α ij lies in the range from 0 to 1. The first eight panels show the rigidity correlation of the individual uncertainty contributions. The bottom right panel shows the total correlation matrix for the data, which combines the individual correlation matrices weighted by their associated uncertainties (Fig. 1 of  the main text). This latter panel shows that at low-and high-rigidities, the bins are weakly correlated (dominated by statistical uncertainties). At intermediate rigidities, the dominant uncertainties is from the Acc., hence a correlation on short rigidity ranges (bottom centre panel), but contributions from other systematics add significant larger-range correlations (white and light blue shade in bottom right panel).
The predicted fluxes also suffer from uncertainties related to model parameters. The likelihood is built assuming these fluxes also follow a Gaussian distribution, with covariance matrix each of the above-mentioned sources contributing independently to the covariance matrix C a . Strictly speaking, predictions would follow a Gaussian distribution, should they vary linearly with respect to Gaussian-distributed model parameters. As errors are small, this idealized case is not far from reality. But cuts on input parameters, such as the Alfvénic speed or the convective wind, and the presence of insuperable although small non-linearities, have motivated us to go a step further. We still stick to the Gaussian assumption, but we directly explore thē p flux behaviour at each rigidity bin i while deriving how different rigidities are correlated.
To do so, for each source a in Eq. (C2), we first draw randomly N ∼ 10 4 realizations of the input parameters, assuming they are Gaussian-distributed with a covariance matrix which we borrow from previous analyses. In the case of CR transport, for instance, we use the bestfit values and covariance matrix of the best-fit low-and intermediate-rigidity parameters as published in [58]. For each set n of input parameters, we calculate thep flux Φ a i,n at AMS-02p rigidity i. From the resulting distribution, we compute the 1σ CL flux envelope. We also infer the average prediction µ a i and covariance matrix The associated correlation matrix c a ij is defined as in Eq. (C1). It is plotted in Fig. 10 for the three sources of model uncertainties, i.e.p production cross section, CR transport and parents. Notice how strongly the different rigidity bins are correlated. This plainly justifies the use of covariance matrices, which turn out to be of paramount importance. The three sources of uncertainties show three different behaviours for their correlation matrices: (i) 'Parent' shows a strong correlation between all bins; (ii) 'Transport' shows a weaker correlation between low-( 4 GV) and high-rigidities, probably related to the low-energy break in the transport model which decouples them; (iii) 'XS' is in-between, reflecting the various cross section contributions; (iv) 'Total', as for the data case, combines the individual correlation matrices weighted by their associated uncertainties (middle panel in Fig. 1 of the main text). It is dominated by 'Transport' at low rigidities, it is a mixture of 'Transport' and 'XS' at intermediate rigidities, and a mixture of 'Transport' and 'Parents' at high rigidities. We stress that the total correlation matrix for the model is very different to that of the data (Fig. 9).

Appendix D: The likelihood tests
Our analysis aims at establishing whether or not thē p flux measured by AMS-02 is compatible with a pure secondary origin. We do not perform any fit of the data. We just gauge how likely they are, once our model for CR transport and secondaryp production is given. Al- Thep flux correlation matrices of the theoretical uncertainties arising fromp production cross section, CR transport and parents are showed in the first three panels. The overall correlation matrix (Total), associated to the covariance matrix C model of Eq. (C2), corresponds to the lowerright panel. The warmer the color, the larger the correlation. though our approach is frequentist, priors on the theoretical parameters are incorporated in the definition of the likelihood. The sources of model uncertainties are to be found inp production cross section, CR transport and parent fluxes.
The likelihood tests are performed by constructing a χ 2 that accounts for both experimental and theoretical uncertainties, with their correlations. The global covariance matrix is defined as the sum while the χ 2 is taken to be where i and j are rigidity indices. The i-th element x i of vector x is the residuals at rigidity i, i.e. the difference between the AMS-02 measurement and the baseline theoretical prediction (inferred with the best-fit CR parameters of model BIG of [58] and using full covariance errors for parent fluxes). Once the χ 2 is determined, the p−value gauges how good is the agreement between data and baseline model. A more direct yet qualititive test is provided by the visual inspection of the z-score which, at rigidity i, is defined as z i = x i / √ C ii . However, in the case of nonvanishing correlations between different rigidity bins, the z-score could be dangerously misleading. Residuals x i need actually to be rotated in a new basis where the covariance matrix becomes diagonal. Such a rotation is performed with the orthogonal matrix U and yields By construction, the covariance matrixC is diagonal, with elementsC ii =σ 2 i . The rotated z-score, dubbed z-score, is defined asz i =x i /σ i and is directly related to the χ 2 through The index i labels now the pseudo (or rotated) rigidity which we define asR with R j the physical rigidity. In the basis of the eigenvectors of the covariance matrixC, the rigidity is no longer diagonal. In our case, the rotation is small, with U close to unity. The pseudo rigidityR i is not very different from the physical value R i . Finally, by virtue of Eq. (D4), thez-score can be interpreted as the correct distance between the data and the baseline model. With our assumptions (see above), each valuez i is expected to be distributed normally, with mean 0 and variance 1. This motivates us to perform a Kolmogorov-Smirnov (KS) test on the AMS-02 data, to supplement the χ 2 analysis.
Appendix E: Uncertainties from the SLIM/BIG/QUAINT transport configurations In [58], we introduced three propagation configurations: SLIM (pure diffusion with low-and high-rigidity breaks), QUAINT ('standard' convection/reacceleration transport with high-rigidity break), both limiting cases of the more general configuration BIG (low-and highrigidity breaks, convection, reacceleration). From a statistical point of view, BIG is slightly preferred by the B/C analysis, but the other two also provide satisfactory χ 2 /dof [58]. Repeating our procedure for thep flux calculation (take the best-fit transport parameters, fit highrigidity break and source parameters on H, He, C, and O data, and calculatep), we investigate here how sensitive the results are to the propagation configuration. bottom panel shows the relative difference for the corresponding ISp flux prediction 7 . The grey shaded area in the top panel highlights rigidities below the kinetic threshold ofp production 8 . Both panels share the same rigidity axis, but to interpret these graphs, we recall that p's at rigidity R GV are created from parents whose rigidities are ten to thousand times larger. Indeed, the low-rigidity variation of thep flux between configurations can be attributed to the configurations themselves, because the few percent variation on the fit parents above 7 We recall that the different configurations have different bestfit values for φ FF , see Table II. The H, He, C, and O fits are performed assuming the same modulation level as obtained in the B/C analysis. See discussion in the main paper. 8 The production threshold is E k = 6mp, which translates into slightly different rigidities for H (A/Z = 1) and other elements (A/Z ∼ 2), but for simplicity we use a single region for all elements in the plot.
the threshold (top panel) cannot provide the 10%p variation below a few GV. At high rigidity, the 5 − 10% increase inp in the QUAINT setup seems to be related to the rise of the H 'bad' fit (discussed in Sec. B 2). However, most of the difference observed for thep flux in the SLIM configuration may be related to the different highrigidity break ∆ h (0.26 vs 0.17 in BIG, see Table II). The latter break in the diffusion coefficient appears only once in the formula of primary fluxes, whereas it appears twice for secondary fluxes: for given fluxes of parents (top panel), the high-rigidity slope of thep predictions (bottom panel) will differ by ∆ h , asymptotically.
We have two last comments on Fig. 11. First, all the fits to the H, He, C, and O data make use of the covariance matrices of errors introduced in Sec. B 1, except for the blue solid line shown in the bottom panel. The latter was calculated for the same configuration as for the reference case (BIG), but using for the H, He, C, and O data uncertainties the quadratic sum of statistical and systematic errors. This shows that a better description of the data uncertainties for thep parents amounts to a ∼ 1 − 2% impact on the calculatedp flux. Second, in the range where AMS-02 measured the TOAp flux (1 − 300 GV), the calculations differ by 20%, meaning that the inter-model variation is already larger or similar to the uncertainties on thep flux data. We also stress that the difference is not a pure normalisation one but also a spectral one; this is important to keep in mind as potential percent excesses inp data (w.r.t. to the astrophysical prediction) may vanish using another propagation model. However, other propagation configurations (SLIM and QUAINT) have different spectral residuals, highlighting once more that bumps and dips in the residuals should not be taken too seriously. The solid blue lines bracket the 68% CL on the transport uncertainties for BIG. It is worth noting that the intra-model transport uncertainty is at the same level as the inter-model uncertainty. This is not surprising since SLIM and QUAINT are actually subcases of the BIG model (see [58]).
The second and third panels of Fig. 12 show the zscore in the standard and 'rotated' basis respectively (see Sec. D). Indeed, the middle panel shows some spectral features (bumps) that the eyes would be eager to interpret as potentially significant DM induced excesses. However, the middle panel does not account properly for correlations in the data between different rigidity bins, whereas the bottom panel does. There is then no signifi- AMS-02 data (symbols), 68% CL on BIG transport parameters (blue solid lines), the SLIM and the QUAINT configurations (dashed green and dashed-dotted red lines respectively). Middle panel: z-score of BIG, SLIM, and QUAINT best-fit configurations (blue circles, green squares, and red crosses respectively) against AMS-02p data Bottom panel: same but for 'rotated' score in the basis where AMS-02p systematics are diagonal. The insert shows the samez-scores projected in an histogram, with a 1σ-width Gaussian distribution on top (solid black line). See text for discussion. cant correlation left, and the inset shows that thez-scores distribution is close to the expected normal distribution (solid black line). This shows that the three transport configurations proposed in [58] are all consistent with thē p data. For a quantitative analysis (using the BIG transport configuration), see the main paper.