Statistical Significance of Reactor Antineutrino Active-Sterile Oscillations

We performed Monte Carlo calculations of the statistical distribution of the $\chi^2$ test statistic used in the analysis of the data of the NEOS, DANSS, Bugey-3, and PROSPECT short-baseline reactor experiments. We show that the statistical significance of the NEOS and DANSS indications in favor of active-sterile neutrino oscillations is smaller than that obtained with the usual method based on the $\chi^2$ distribution. In the combined analysis of the data of the four experiments we find that the statistical significance of active-sterile neutrino oscillations is reduced from $2.4\sigma$ to $1.8\sigma$.

Active-sterile neutrino oscillations due to a nonstandard massive neutrino at the eV scale is one of the current hot topics of research that could reveal the existence of new physics beyond the Standard Model. This line of research is historically motivated by three indications of short-baseline neutrino oscillations that cannot be explained in the standard three-neutrino mixing framework (see the recent reviews in Refs. [1][2][3]): 1) the LSND observation of short-baselineν µ →ν e transitions [4] (the LSND anomaly); 2) the measurement of shortbaseline ν e disappearance [5,6] in the gallium source experiments GALLEX [7] and SAGE [8] (the gallium neutrino anomaly); 3) the short-baselineν e disappearance of reactorν e [9] with respect to the theoretical prediction of the reactorν e fluxes [10,11] (the reactor antineutrino anomaly). In particular, the discovery in 2011 of the reactor antineutrino anomaly raised a great interest and triggered the realization of several new reactor experiments searching for short-baselineν e disappearance in a model-independent way, that does not depend on the theoretical prediction of the reactorν e fluxes. This goal can be achieved by confronting the antineutrino detection rates or energy spectra measured at different distances from the reactor and fitting the ratio with the effective short-baseline (SBL) survival probability where ∆m 2 41 = m 2 4 −m 2 1 1 eV 2 is the difference between the squared mass of the nonstandard neutrino ν 4 and the squared masses m 1 , m 2 , m 3 m 4 of the standard neutrinos in the 3+1 active-sterile mixing scheme, that is the simplest one that can explain short-baseline neutrino oscillations. The amplitude of the oscillations is given by , where ϑ ee is the effective mixing angle for short-baseline ν e andν e disappearance and U is the 4 × 4 unitary neutrino mixing matrix. Different experiments are characterized by different ranges of the neutrino energy E and measure the neutrino interaction rate with detectors located at different distances L from the neutrino source. * carlo.giunti@to.infn.it Of particular importance are the results of the NEOS [12] and DANSS [13,14] experiments, that have a remarkable agreement for oscillations generated by ∆m 2 41 ≈ 1.3 eV 2 and sin 2 2ϑ ee ≈ 0.02 − 0.05 [15][16][17][18]. The statistical significance of the combined NEOS and DANSS indication of short-baseline oscillations is 3.7σ [15] using the published 2018 DANSS results [13] and 2.6σ [18] using the preliminary 2019 DANSS results [14]. This statistical significance is clearly not enough to claim a discovery, especially taking into account the decrease passing from the 2018 to the 2019 DANSS results. However, the indication is intriguing and it is worth to be examined carefully.
A source of misinterpretation of neutrino oscillation experiments is the assumption that the χ 2 test statistic (or twice the negative logarithm of the likelihood) considered in the analysis of the data follows a χ 2 distribution, according to Wilks' theorem [19]. This is often not correct and can lead to significant statistical inaccuracies [20][21][22][23]. In particular, as explained in Ref. [22], the analysis of binned data in terms on neutrino oscillations suffers from serious undercoverage for values of the mixing that are below the sensitivity of the experiment. This is due to the fact that even in the absence of real oscillations, binned data can often be fitted better by oscillations that reproduce the statistical fluctuations of the bins. In order to overcome this problem, some experimental collaborations analyze their data performing a Monte Carlo estimation of the distribution of the χ 2 test statistic. More precisely, in order to evaluate properly the allowed regions in the oscillation parameter space it is necessary to evaluate by Monte Carlo the distribution of the difference ∆χ 2 between the χ 2 in each point of the parameter space and the global minimum of the χ 2 . This is an expensive computational task that is typically performed on a grid in the two-dimensional oscillation parameter space, that involves tens of thousands of points. Instead, assuming the χ 2 distribution for ∆χ 2 , the contours of the allowed regions in the oscillation parameter space are simply given by the fixed values of ∆χ 2 in Table 39.2 of Ref. [24] for the number of degrees of freedom equal to number of oscillation parameters.
So far, the combined analyses of the data of different experiments, sometimes called "global fits", have been performed assuming the χ 2 distribution (except for the analysis of solar neutrino data in Ref. (B) The computational task for evaluating the distribution of the ∆χ 2 on a fine grid in the parameter space is prohibitive.
In particular, the combined analyses of the data of NEOS, DANSS, and other reactor experiments in terms of short-baseline oscillations [15][16][17][18] have been performed assuming the χ 2 distribution. This can lead to a serious overestimation of the statistical significance of the indication of short-baseline oscillations.
In this paper we improve the combined analysis of NEOS, DANSS, and other reactor experiments presented in Refs. [15,18] by performing the required Monte Carlo estimation of the distribution of the ∆χ 2 . We overcame the two difficulties A) and B) mentioned above by: (a) Extracting from the published experimental papers all the information that allows to analyze the experimental data taking into account the uncertainties that are shown in the figures or provided in supplementary files. This approach neglects the correlations of the systematic uncertainties of binned data when it is not publicly available. Therefore, the analysis is not "exact", but it is an acceptable approximation for the estimation of the distribution of the ∆χ 2 .
(b) Performing the Monte Carlo simulations on a reduced grid of points in the two-dimensional (sin 2 2ϑ ee , ∆m 2 41 ) plane with a few thousands of points and interpolating the distribution of the ∆χ 2 . This a good approximation, because the variations of the distribution of the ∆χ 2 are smooth.
Regarding the item a), let us further emphasize that the mentioned approximation is adopted only for the estimation of the distribution of the ∆χ 2 with Monte Carlo simulated data. On the other hand, the evaluation of the value of the χ 2 for the actual data must take into account the correlations of the systematic uncertainties of binned data, if they exist and are known. For the NEOS experiment we use the χ 2 table kindly provided by the NEOS collaboration, that takes into account all uncertainties. For the DANSS experiment we use a χ 2 without bin correlations, as done by the DANSS collaboration [13,14].
Besides the data of the NEOS and DANSS experiments, we took into account also the spectral-ratio data of the Bugey-3 [26] and PROSPECT [27] experiments 1 , that exclude large values of sin 2 2ϑ ee in a modelindependent way, as shown in Ref. [18]. In particular, 1 We cannot include in the analysis the results of the STEREO [28,29] and Neutrino-4 [30,31] experiments, because there is not enough available information. the analysis of the PROSPECT data is a useful test of our method, because the PROSPECT collaboration published the map in the (sin 2 2ϑ ee , ∆m 2 41 ) plane of the ∆χ 2 corresponding to 90% and 95% CL, that they evaluated with a Monte Carlo that takes into account all the experimental details, including the correlations of the systematic uncertainties of the binned data. They also published the map of the χ 2 of the actual data, that we use in the fit. Figure 1 shows the comparison of the 90% CL exclusion curves in the (sin 2 2ϑ ee , ∆m 2 41 ) plane obtained with the χ 2 distribution for ∆χ 2 (magenta dotted), with the PROSPECT ∆χ 2 map (red dashed), and with our Monte Carlo calculation (blue solid). One can see that our method yields an exclusion curve that is close to the PROSPECT one and can be considered as an acceptable approximation, especially taking into account the large improvement with respect to the exclusion curve obtained assuming the χ 2 distribution.
Having established the approximate reliability of our method, we present in Figure 2 the contours of the allowed regions in the (sin 2 2ϑ ee , ∆m 2 41 ) plane at different confidence levels of the four reactor spectral-ratio experiments that we take into account: NEOS [12], DANSS [14], Bugey-3 [26], and PROSPECT [27]. The figures show also the marginal ∆χ 2 's for sin 2 2ϑ ee and ∆m 2 41 and the values of the marginal ∆χ 2 's corresponding to different confidence levels obtained with the χ 2 distribution (dashed) and our Monte Carlo (solid). Note that the marginal ∆χ 2 's corresponding to different confidence levels obtained with the χ 2 distribution have fixed values given by

FIG. 2.
Contours of the allowed regions in the (sin 2 2ϑee, ∆m 2 41 ) plane at different confidence levels for the four reactor spectralratio experiments (a) NEOS [12], (b) DANSS [14], (c) Bugey-3 [26], and (d) PROSPECT [27]. The solid lines represent the contours obtained with our Monte Carlo evaluation of the distribution of ∆χ 2 , and the dashed lines depict the contours obtained assuming the χ 2 distribution. Also shown are the marginal ∆χ 2 's for sin 2 2ϑee and ∆m 2 41 , together with the ∆χ 2 values corresponding to different confidence levels obtained with the χ 2 distribution (dashed) and our Monte Carlo (solid). The best-fit points are indicated by crosses.
freedom, whereas the Monte Carlo ones depend on the value of the unmarginalized parameter.
From Figure 2 one can see that in general the difference between the Monte Carlo contours and those obtained with the χ 2 distribution is larger for small values of sin 2 2ϑ ee , where the χ 2 distribution leads to undercoverage, as discussed in Ref. [22]. This is due to the larger probability of fitting the statistical fluctuations of the data with fake oscillations when the signal is small or absent. Correspondingly, the Monte Carlo values of  [12], DANSS [14], Bugey-3 [26], PROSPECT [27], and to the combined fit.
the marginal ∆χ 2 sin 2 2ϑee 's associated with different confidence levels increase in a significant way for decreasing sin 2 2ϑ ee . On the other hand, the Monte Carlo values of the marginal ∆χ 2 ∆m 2 41 's corresponding to different confidence levels are approximately flat, with slightly smaller values in the range of ∆m 2 41 where the experiment is mostly sensitive. However, they are significantly larger than those obtained with the χ 2 distribution, because for each value of ∆m 2 41 the contribution to the marginalization of small values of sin 2 2ϑ ee lead to undercoverage in the case of the χ 2 distribution. In the range of ∆m 2 41 where the experiment is mostly sensitive, the undercoverage is slightly smaller, because it occurs only for very small values of sin 2 2ϑ ee .
The Monte Carlo evaluation of the distribution of the ∆χ 2 allows us also to estimate reliably the statistical significance of the indication in favor of active-sterile neutrino oscillations in each experiment. This is done by considering the difference ∆χ 2 NO = χ 2 NO − χ 2 min , where χ 2 NO is the value of the χ 2 obtained without neutrino oscillations. If the conditions for Wilks' theorem [19] hold, ∆χ 2 NO has a χ 2 distribution with two degrees of freedom corresponding to the two fitted parameters sin 2 2ϑ ee and ∆m 2 41 . Table I shows the values of ∆χ 2 NO that we obtained in the analyses of the data of the four reactor spectral-ratio experiments NEOS [12], DANSS [14], Bugey-3 [26], and PROSPECT [27]. Also shown are the p-value of the no-oscillation hypothesis and the corresponding number of σ's for a two-sided Gaussian deviation obtained with the χ 2 distribution and the Monte Carlo distribution. The statistical significance of the indication in favor of active-sterile neutrino oscillations can be quantified by the smallness of the p-value or by the number of σ's. One can see that for all the considered experiments this statistical significance is reduced considering the Monte Carlo distribution of ∆χ 2 NO with respect to the χ 2 distribution.
The NEOS indication of neutrino oscillations is strongly reduced from 2.0σ to 1.3σ using the Monte Carlo distribution instead of the χ 2 distribution. Note that our Monte Carlo 20% p-value is in very good approximate agreement with the 22% p-value estimated by the NEOS

FIG. 3.
Contours of the allowed regions in the (sin 2 2ϑee, ∆m 2 41 ) plane at different confidence levels obtained with the combined analysis of the data of the four reactor spectral-ratio experiments NEOS [12], DANSS [14], Bugey-3 [26], and PROSPECT [27]. The solid lines represent the contours obtained with our Monte Carlo evaluation of the distribution of ∆χ 2 , and the dashed lines depict the contours obtained assuming the χ 2 distribution. Also shown are the marginal ∆χ 2 's for sin 2 2ϑee and ∆m 2 41 , together with the ∆χ 2 values corresponding to different confidence levels obtained with the χ 2 distribution (dashed) and our Monte Carlo (solid). The cross indicates the best-fit point.
collaboration [12]. This is another confirmation of the validity of our method.
Using the Monte Carlo distribution instead of the χ 2 distribution, the significance of the DANSS indication of neutrino oscillations is only slightly reduced from 2.2σ to 2.0σ.
From Table I one can also see that the Bugey-3 experiment has clearly no indication of neutrino oscillations, independently of the assumed distribution of ∆χ 2 NO . Indeed, although the best-fit point in Figure 2(c) corresponds to large oscillations, all the confidence level contours are exclusion curves and the values of the marginal ∆χ 2 's are very small.
The PROSPECT χ 2 has a minimum for large oscillations, as shown by Figure 2(d), but the value of the ∆χ 2 is small and in agreement with the absence of oscillations. The Monte Carlo calculation of the distribution of the ∆χ 2 gives a statistical significance for oscillations that is only 0.9σ, and significantly smaller than the 1.5σ obtained with the χ 2 distribution. Our Monte Carlo pvalue of 37% is not as large as the 58% obtained by the PROSPECT collaboration [27], but it is much closer to it than the 14% obtained with the χ 2 distribution.
We finally performed the combined fit of the data of all the four reactor spectral-ratio experiments NEOS [12], DANSS [14], Bugey-3 [26], and PROSPECT [27]. Figure 3 shows the contours of the allowed regions in the (sin 2 2ϑ ee , ∆m 2 41 ) plane at different confidence levels and the marginal ∆χ 2 's for sin 2 2ϑ ee and ∆m 2 41 . As in Figure 2, the dashed lines correspond to the χ 2 distribution, whereas the solid lines have been obtained with our Monte Carlo. The combined fit yields the best-fit point at sin 2 2ϑ ee = 0.026 and ∆m 2 41 = 1.3 eV 2 with χ 2 min = 183.6 and 197 degrees of freedom. Therefore, the fit is very good, either considering a χ 2 distribution for χ 2 min (75% goodness of fit) or with a Monte Carlo estimation (89% goodness of fit). The best fit corresponds to the allowed regions of NEOS and DANSS at ∆m 2 41 ≈ 1.3 eV 2 , that can be seen in Figures 2(a) and 2(b). Figure 3 shows that the Monte Carlo allowed regions are much larger than those obtained with the χ 2 distribution and extend to smaller values of the mixing. This means that the preference for neutrino oscillations is smaller. Indeed, from Table I one can see that the significance of the indication of neutrino oscillations is reduced from 2.4σ to 1.8σ. Correspondingly, in Figure 3 there are several closed contours that delimit regions allowed at 90% CL, that corresponds to about 1.64σ.
In conclusion, our Monte Carlo calculations show that the significance of the indication of neutrino oscillations given by the combined analysis of the reactor spectralratio measurements is not as large as thought before [15][16][17][18] using a χ 2 distribution for ∆χ 2 . Nevertheless, there is still an intriguing 1.8σ indication in favor of activesterile neutrino oscillations that will hopefully be resolved by new data of the ongoing DANSS [13,14], PROSPECT [27], STEREO [28,29], Neutrino-4 [30,31], and SoLid [32] reactor spectral-ratio experiments.