Searching for eV-scale sterile neutrinos with eight years of atmospheric neutrinos at the IceCube neutrino telescope

We report in detail on searches for eV-scale sterile neutrinos, in the context of a 3+1 model, using eight years of data from the IceCube neutrino telescope. By analyzing the reconstructed energies and zenith angles of 305,735 atmospheric $\nu_\mu$ and $\bar{\nu}_\mu$ events we construct confidence intervals in two analysis spaces: $\sin^2 (2\theta_{24})$ vs. $\Delta m^2_{41}$ under the conservative assumption $\theta_{34}=0$; and $\sin^2(2\theta_{24})$ vs. $\sin^2 (2\theta_{34})$ given sufficiently large $\Delta m^2_{41}$ that fast oscillation features are unresolvable. Detailed discussions of the event selection, systematic uncertainties, and fitting procedures are presented. No strong evidence for sterile neutrinos is found, and the best-fit likelihood is consistent with the no sterile neutrino hypothesis with a p-value of 8\% in the first analysis space and 19\% in the second.

Anomalies in short-baseline oscillation experiments studying neutrinos from pion decay-at-rest [1], meson decay-in-flight beams [2], and nuclear reactors [3] have produced a string of experimental observations that suggest unexpected neutrino flavor transformation at short baselines. These observations are anomalies under the well-established three-massive-neutrino framework, but can be accommodated, to some extent, by the addition of a new heavy neutrino mass state ν4. For consistency with constraints from invisible Z-boson decay [4] and existing unitarity constraints on the three Standard Model neutrinos [5], the new state is mostly composed of a sterile flavor νs, which does not participate in Standard Model electroweak interactions. The sterile neutrino hypothesis is the minimal explanation of the anomalous observations. It has motivated a worldwide program to search for new particle states with mass-squared differences between 0.1 eV² and 10 eV² [6]. Notable other explanations include, for example, phenomenology that modifies the vacuum oscillation probability relevant to short-baseline neutrino experiments, modifications of neutrino propagation in matter [31][32][33][34][35], or production of new particles in the beam or in the detector and its surroundings [36][37][38][39][40][41][42][43][44][45][46][47][48][49].
The simplest sterile neutrino model, called the "3+1" model, introduces a single mass eigenstate ν4 that is heavier than the three mass states (ν1, ν2, ν3), which are mostly composed of active neutrinos, by a fixed mass-squared difference ∆m²41 = m4² − m1². Three-flavor neutrino mixing is described by the well-known 3 × 3 Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix [50,51]. In the 3+1 model, an extended 4 × 4 PMNS matrix U4×4 includes additional elements Ue4, Uµ4, and Uτ4 to account for the heavy neutrino fraction of the three active flavors. This extended PMNS matrix can be parameterized as
$$U_{4\times4} = R_{34}(\theta_{34})\,\tilde{R}_{24}(\theta_{24},\delta_{24})\,\tilde{R}_{14}(\theta_{14},\delta_{14})\,U_{\rm PMNS},$$
where U_PMNS is block diagonal between the first three components and the fourth, R_ij denotes a rotation in the ij plane (a tilde indicating an associated complex phase), and the new matrix elements are expressed in terms of three mixing angles θ14, θ24, θ34, and two observable CP-violating phases, δ14 and δ24 [52]. The 3+1 model is widely used as a benchmark for experimental datasets to examine whether they show evidence for a sterile neutrino. Extensions to this model have been proposed, such as adding more neutrino mass states [52], allowing the heavier mass states to decay [27,48,49,53], or introducing secret neutrino interactions [32, 35-37, 40, 44-47, 54-57]; these more complex models are not considered further in this work.
In terrestrial experiments with low-energy neutrinos (< 100 GeV) and short baselines (≤ 1 km), neutrino oscillations involving the heavier mass state proceed as in vacuum, parameterized by a single effective mixing angle, which determines the oscillation amplitude, and the value of ∆m²41, which controls the oscillation wavelength. Although mixing generally depends on all of the model parameters, results of vacuum-like neutrino oscillation experiments are often presented in a two-dimensional parameter space of one mixing angle and one mass-squared difference, where the relationship between the effective mixing angle and the rotation mixing angles θ14, θ24, and θ34 depends on the oscillation channel.
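In this vacuum-like regime the survival probability reduces to the familiar two-flavor formula P = 1 − sin²(2θ) sin²(1.27 ∆m²[eV²] L[km] / E[GeV]). A minimal sketch of this textbook expression (the function name and unit conventions are ours, not taken from the analysis code):

```python
import numpy as np

def vacuum_survival(sin2_2theta, dm2_eV2, L_km, E_GeV):
    """Two-flavor vacuum survival probability:
    P = 1 - sin^2(2 theta) * sin^2(1.27 * dm2[eV^2] * L[km] / E[GeV])."""
    phase = 1.27 * dm2_eV2 * L_km / np.asarray(E_GeV, dtype=float)
    return 1.0 - sin2_2theta * np.sin(phase) ** 2

# e.g. an Earth-diameter baseline (~12742 km) at 1 TeV
p = vacuum_survival(0.1, 1.0, 12742.0, 1000.0)
```

For sin²(2θ) = 0 the probability is identically 1, and for any mixing it stays within [1 − sin²(2θ), 1].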
In the 3+1 model, the disappearance and appearance oscillation probabilities are related. A consistent interpretation of all data sets requires that non-zero oscillations be observed in ν µ → ν e , ν µ → ν µ , and ν e → ν e channels. At the present time this is not the case. Flavor change consistent with the presence of an eV-scale new neutrino mass state has been observed in some ν µ → ν e appearance experiments [58,59]. A general deficit of antineutrinos observed from nuclear reactors can be interpreted as a finite ν e → ν e disappearance signature [3], although the complexities of modeling the reactor antineutrino flux normalization and shape remain controversial, with much ongoing theoretical and experimental work [60][61][62][63][64][65]. The complementary ν µ → ν µ channel has been studied by various experiments, but no anomalous flavor change has been observed [66][67][68][69][70][71][72][73].
Global fits to world data prefer the 3+1 model over a model with no sterile neutrinos by more than 5σ [52]. This is despite the fact that the apparent observation of flavor change in the νµ → νe channel, the apparent non-observation of flavor change in the νµ → νµ channel, and present knowledge of the allowed magnitude of νe → νe disappearance remain difficult to reconcile under the 3+1 model. Furthermore, removing any single data set does not relieve this tension to an acceptable level [74]. Continued study of the νµ disappearance channel with increased sample size and systematically controlled experimental datasets therefore represents a critical aspect of reaching a conclusive statement about eV-scale sterile neutrinos.
One of the strongest constraints on sterile-neutrino-induced νµ and ν̄µ disappearance comes from the study of high-energy atmospheric neutrinos observed by IceCube [75]. The IceCube Neutrino Observatory [76] is a gigaton ice-Cherenkov detector located near the South Pole. It comprises 5160 digital optical modules (DOMs) [77], which are self-sufficient detection units made of photomultiplier tubes enclosed in pressure housings, deployed on 86 vertically oriented "strings" extending between 2450 m and 1450 m below the surface of the Antarctic ice sheet [78,79]. These modules detect the Cherenkov light emitted by charged particles created in high-energy neutrino interactions. Most of IceCube's detected neutrinos are produced in cosmic-ray air showers and span an energy range from approximately 10 GeV to 1 PeV, with the peak detected flux around 1 TeV [79]. High-energy muons produced in cosmic-ray air showers can penetrate the Antarctic ice sheet and dominate the downward-going event rate in IceCube. Therefore, to select neutrino events, it is common to consider only events originating below the horizon (upward-going).
Sterile neutrinos in IceCube would give rise to a suite of oscillation effects: not only simple vacuum-like oscillations, but also matter-enhanced resonant effects induced as high-energy neutrinos cross the core of the Earth [80][81][82][83][84]. These resonances can lead to a dramatic magnification of the νµ disappearance signature within the IceCube atmospheric neutrino sample for mass-squared differences between 0.1 eV² and 10 eV². For example, order-of-magnitude enhancements in the disappearance probability occur at the peak energy of 1 TeV for plausible values of ∆m²41 and θ24. Some examples are shown in Fig. 1.
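As a rough guide, the energy at which such a resonance occurs can be estimated from the standard two-flavor level-crossing condition (a textbook estimate, not a formula quoted from this analysis; signs and factors depend on convention and channel):
$$\frac{\Delta m^2_{41}\cos 2\theta_{24}}{2E_{\rm res}} \simeq \frac{G_F N_n}{\sqrt{2}},$$
where $G_F$ is the Fermi constant and $N_n$ the neutron number density along the trajectory, reflecting that the sterile state feels no neutral-current potential. For $\Delta m^2_{41}$ of order 1 eV² and Earth-core densities this condition is met at TeV energies, near the peak of the detected atmospheric flux.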
IceCube has previously searched for matter-enhanced signatures of sterile neutrinos using one year of high-energy atmospheric neutrino data [86]. The one-year sample contained 20,145 upward-going muon tracks in the approximate energy range 400 GeV to 20 TeV, with a non-neutrino-induced background of less than 0.01%. The study found no evidence for νµ disappearance and placed a constraint in an unexplored area of the sin²(2θ24)-∆m²41 parameter space. The resulting upper limit on θ24 was constructed assuming that all other heavy-neutrino-related mixing angles are zero. The value of θ14 does not affect the upper limit, while the choice of θ34 = 0 yields a conservative upper limit on sin²(2θ24) [87] when θ34 < 25° [88]. The statistical uncertainties in that analysis were at or below 5% per bin, mandating a comparable level of control of systematic uncertainties in the Antarctic ice model, atmospheric neutrino flux, and detector response.
For values of |∆m²41 L/E| ≫ 1, sterile neutrino oscillations are rapid in energy and become unresolvable within the detector resolution of approximately log10(E/1 TeV) ≈ 0.3 [89], leading instead to a deficit that is approximately independent of ∆m²41, L, and E. In this regime, the presence of flavor-violating mixing makes it possible to search for signatures of sterile neutrinos either as a modification of standard neutrino oscillations [90] or as anomalous flavor conversion [91], proportional to the matter density traversed. IceCube has previously searched for this effect using its low-energy dataset, examining 5,118 total events collected over three years in the energy range of 6.3 GeV to 56 GeV [92]. Because of their low energies, event reconstruction is more challenging in this sample and backgrounds are difficult to reduce. To mitigate background contamination, the DeepCore sub-array [79,93] was used for event selection and reconstruction, with the remainder of the IceCube array serving as a veto against atmospheric muon backgrounds. No evidence of atmospheric νµ disappearance was observed, leading to a limit expressed in terms of the mixing matrix elements |Uµ4|² = sin²θ24 and |Uτ4|² = cos²θ24 sin²θ34.

FIG. 1. Disappearance probability calculated using nuSQuIDS [85] for several sterile neutrino parameters. The ν̄µ disappearance probability is shown in terms of true neutrino energy and cosine of the zenith angle. The top row has fixed sin²(2θ24) = 0.1 and mass-squared differences increasing from left to right; the bottom row has fixed ∆m²41 = 1 eV² and mixing increasing from left to right. Visible discontinuities occur at the inner-core to outer-core boundary (cos(θz) = −0.98) and the outer-core to mantle boundary (cos(θz) = −0.83). Note that the peak of the IceCube flux is at around 1 TeV.
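In the fast-oscillation limit, sin² of the oscillation phase averages to 1/2, so the survival probability becomes independent of ∆m²41, L, and E. A toy numerical check of this averaging (all names and numbers are illustrative):

```python
import numpy as np

def averaged_survival(sin2_2theta):
    """Fast-oscillation limit: <sin^2(phase)> -> 1/2, so
    P -> 1 - sin^2(2 theta)/2, independent of dm2, L, and E."""
    return 1.0 - 0.5 * sin2_2theta

def full_survival(sin2_2theta, dm2, L_over_E):
    """Full two-flavor vacuum formula, for comparison."""
    return 1.0 - sin2_2theta * np.sin(1.27 * dm2 * L_over_E) ** 2

# average the full formula over a fine grid of L/E values deep in the
# rapid-oscillation regime (phase spans ~100 oscillation periods)
mean_p = full_survival(0.2, 50.0, np.linspace(10.0, 20.0, 200001)).mean()
```

The average of the rapidly oscillating formula converges to the analytic limit as the phase sweeps through many periods.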
This article presents an update of the search for sterile neutrinos with IceCube in both the resonant ("Analysis I") and large-∆m²41 ("Analysis II") scenarios using the IceCube 8-year high-energy neutrino dataset. The data comprise 305,735 νµ events with reconstructed energies between 500 GeV and 10 TeV. Relative to earlier analyses, we use an improved high-efficiency event selection and a significantly updated detector model and calibration. The increased sample size over IceCube's previously published event selection [94] has mandated a substantial overhaul of the systematic uncertainty treatment related to the glacial ice, detector response, incident neutrino flux, and neutrino interactions in order to achieve systematic control at the few-percent level. This paper aims to provide a comprehensive explanation of the search for sterile neutrinos with the IceCube 8-year high-energy neutrino data set presented in Ref. [95]. Further information can be found in Refs. [96][97][98].

II. ANALYSIS OVERVIEW
The main results presented in this paper are two independent sets of frequentist confidence intervals applied in distinct analysis sub-spaces, which we will refer to as Analysis I and Analysis II. The two analysis spaces are constructed such that the only parameter point shared by both is the "no sterile neutrinos" hypothesis. For Analysis I, a Bayesian model comparison is also constructed, as reported in Ref. [95]. This paper provides detailed information on both frequentist and Bayesian analysis techniques and results. These two statistical approaches aim to answer different, though often related, questions. The Bayesian approach informs us about the probability of the model given the observed data and has the advantage of a unified interpretation of systematic and statistical uncertainties, whereas the frequentist approach allows us to construct intervals that encompass the regions of parameters that best match the data, enabling direct comparison with other published confidence intervals.
The results of Analysis I are given in terms of ∆m²41 and sin²(2θ24), with |Uτ4|² (or equivalently, θ34) set to zero. As with the previous IceCube sterile neutrino search, this choice is conservative since θ34 = 0 minimizes sensitivity to the effects of non-zero θ24 [84,87]. Neither analysis has sensitivity to |Ue4|², so it is set to zero throughout; similarly, the new CP-violating phases are set to zero, as they affect the results only marginally. Analysis II applies to the regime where sterile-neutrino-driven oscillations are fully averaged within the energy resolution of the detector, which is the case for ∆m²41 ≳ 20 eV² given our energy range and resolution. For the purpose of calculation we have fixed ∆m²41 = 50 eV². The results of Analysis II are intervals in the two rotation angles sin²(2θ34) and sin²(2θ24).
The expected at-detector neutrino flux is calculated at each hypothesis point in the physics parameter space. This involves simulating the neutrino oscillations and absorption across the Earth; determining the interaction point in the ice or rock; producing and propagating final-state particles; modeling the detector response; emulating the online triggering and at-Pole event selection; performing event reconstruction; and applying the high-level event selection. The signal expectations at each parameter point can then be used to generate pseudo-experiments for construction of frequentist intervals, or to compute the Bayesian evidence when constructing the Bayesian hypothesis test.
Events are selected using a new high-efficiency and high-purity event selection described in detail in Sec. IV. All data in the sample have been reprocessed with the most up-to-date IceCube calibration protocols described in Ref. [99] and only include events recorded when all 86 strings of the IceCube array were fully functional. The total data set contains 305,735 events collected over a livetime of 7.634 years, starting on May 13, 2011 and ending on May 19, 2019. The energy proxy and directional reconstructions are calculated using the latest versions of internal IceCube event reconstruction software packages, similar to those used in Ref. [94]. The expected angular resolution σ_cos(θz) varies between 0.005 and 0.015 as a function of energy, and the energy resolution is approximately σ_log10(Eµ) ∼ 0.5 [100]. The data are divided into 260 bins in reconstructed muon energy and the cosine of the zenith angle, cos(θz). The reconstructed energy is binned logarithmically in steps of 0.10, from 500 GeV to 9976 GeV (13 bins), and cos(θz) is binned linearly in steps of 0.05, from −1.0 to 0.0 (20 bins).
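The binning described above can be reproduced in a few lines; a sketch using hypothetical toy events (the edge values come from the text, the events do not):

```python
import numpy as np

# 13 log10-uniform energy bins of width 0.10 starting at 500 GeV,
# and 20 linear cos(zenith) bins from -1.0 to 0.0
energy_edges = 10 ** (np.log10(500.0) + 0.10 * np.arange(14))
cosz_edges = np.linspace(-1.0, 0.0, 21)

# histogram a toy set of "reconstructed" events into the 13 x 20 = 260 bins
rng = np.random.default_rng(0)
e_reco = 10 ** rng.uniform(np.log10(500.0), np.log10(energy_edges[-1]), 1000)
cz_reco = rng.uniform(-1.0, 0.0, 1000)
counts, _, _ = np.histogram2d(e_reco, cz_reco, bins=[energy_edges, cosz_edges])
```

The last energy edge lands at 10^(log10(500) + 1.3) ≈ 9976 GeV, matching the quoted upper bound.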
We perform frequentist parameter estimation using a maximum-likelihood approach. In this work, the likelihood function, which describes the probability of the observed data given a specified physics model, is defined as
$$\mathcal{L}(\vec{\Theta}, \vec{\eta}) = \prod_{i \in \mathrm{bins}} \mathcal{L}_{\rm eff}\big(x_i;\, \mu_i(\vec{\Theta}, \vec{\eta}),\, \sigma_i(\vec{\Theta}, \vec{\eta})\big),$$
where x_i is the number of observed events in bin i; µ_i(Θ, η) and σ_i(Θ, η) are the expected number of events and its corresponding Monte Carlo (MC) statistical uncertainty in the same bin; and Θ and η are the sets of physics and systematic nuisance parameters, respectively. The bin-wise likelihood function L_eff is a modified version of the Poisson likelihood that accounts for Monte Carlo statistical uncertainties, first introduced in Ref. [101]. Using this protocol, the effects of finite Monte Carlo statistics on the analysis results become negligible.
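The effective likelihood of Ref. [101] marginalizes the Poisson mean over a gamma distribution whose width encodes the MC statistical uncertainty, yielding a negative-binomial-like form. The sketch below implements such a Poisson-gamma likelihood; the specific (α, β) mapping is our illustrative choice and should not be taken as the exact published form:

```python
import math

def log_eff_likelihood(k, mu, sigma2):
    """Log of a Poisson likelihood marginalized over a gamma-distributed
    mean with expectation mu and variance sigma2 (the MC statistical
    error). The mapping alpha = mu^2/sigma2 + 1, beta = mu/sigma2 is
    one convention, chosen here for illustration."""
    alpha = mu * mu / sigma2 + 1.0
    beta = mu / sigma2
    return (alpha * math.log(beta)
            + math.lgamma(k + alpha)
            - math.lgamma(k + 1.0)
            - (k + alpha) * math.log(1.0 + beta)
            - math.lgamma(alpha))

def log_poisson(k, mu):
    """Ordinary Poisson log-pmf, for comparison."""
    return k * math.log(mu) - mu - math.lgamma(k + 1.0)
```

As sigma2 → 0 (infinite MC statistics) the effective likelihood recovers the ordinary Poisson likelihood, which is the defining property the text describes.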
For the frequentist analysis, we use the profile likelihood technique to account for systematic uncertainties. The profile likelihood is defined as the constrained optimization of the likelihood, where the constraints from external information on the nuisance parameters are encoded in penalty functions:
$$\mathcal{L}_{\rm profile}(\vec{\Theta}) = \max_{\vec{\eta}} \Big[ \mathcal{L}(\vec{\Theta}, \vec{\eta}) \prod_j \Pi_j(\eta_j) \Big].$$
The penalty terms Π_j for each nuisance parameter are Gaussian functions with central values and standard deviations given in Table IV. The minimization is performed with the limited-memory L-BFGS-B algorithm [102], which respects box constraints on the parameters. In order to construct confidence regions for our parameters of interest we use the test statistic
$$\mathrm{TS}(\vec{\Theta}) = -2 \ln \left[ \frac{\mathcal{L}_{\rm profile}(\vec{\Theta})}{\mathcal{L}_{\rm profile}(\hat{\vec{\Theta}})} \right],$$
where Θ̂ is the best-fit point that maximizes L_profile. We construct frequentist confidence regions using the Neyman construction [103] with the Feldman-Cousins ordering scheme [104]. Based on validations at several points in the parameter space with Monte Carlo ensembles, we find that the test-statistic distribution follows Wilks' theorem faithfully [105], so we use it to draw the final contours.
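The profile-likelihood test statistic can be illustrated on a toy Poisson fit with a single Gaussian-constrained nuisance parameter (here profiled by a brute-force scan rather than L-BFGS-B; all numbers are invented):

```python
import numpy as np

def neg2ll(theta, eta, data, template):
    """-2 log L for a toy Poisson fit: expectation (1 + eta) * theta *
    template, with a unit-Gaussian penalty term on the nuisance eta."""
    mu = (1.0 + eta) * theta * template
    # Poisson term (up to a data-only constant) plus the Gaussian penalty
    return 2.0 * np.sum(mu - data * np.log(mu)) + eta ** 2

def profile_ts(theta_values, data, template, eta_grid):
    """Profile out eta at each theta, then form
    TS(theta) = -2 ln[L_profile(theta) / L_profile(theta_hat)]."""
    prof = np.array([min(neg2ll(t, e, data, template) for e in eta_grid)
                     for t in theta_values])
    return prof - prof.min()

template = np.array([5.0, 10.0, 20.0])   # toy expectation per bin
data = np.array([6.0, 9.0, 22.0])        # toy observed counts
thetas = np.linspace(0.5, 1.5, 101)
ts = profile_ts(thetas, data, template, np.linspace(-0.5, 0.5, 201))
```

By construction TS vanishes at the best-fit point and grows away from it; under Wilks' theorem its distribution would approach a chi-squared with the dimensionality of the physics parameter space.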
We have also performed a Bayesian model selection analysis, reported in Ref. [95]. In order to avoid dependence on the physics parameter priors we compare each physics parameter point to the no sterile neutrino hypothesis. We compute the Bayesian evidence E at each physics parameter point Θ′. The evidence of a model with prior Π_Θ(Θ) = δ(Θ − Θ′) is given by [106]:
$$\mathcal{E}(\vec{\Theta}') = \int \mathcal{L}(\vec{\Theta}', \vec{\eta})\, \Pi(\vec{\eta})\, d\vec{\eta},$$
where the priors Π on the nuisance parameters are the same constraints used in the frequentist analysis, given in Table IV, and the integral is evaluated using the MultiNest algorithm [107]. The ratio between the evidence for a given model parameter point and for the null hypothesis is known as the Bayes factor, and quantifies the preference for the alternative model over the null. Following usual practice, whenever appropriate, we assign a qualitative statement to the model comparison using the Jeffreys scale [108]. In this scale, strong preference for the alternative model over the null is stated when there is 95% certainty of the alternative, assuming both hypotheses are a priori equally likely.
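For a single nuisance parameter, the evidence integral can be evaluated by simple quadrature rather than nested sampling; a toy sketch (the Gaussian log-likelihoods are invented for illustration):

```python
import numpy as np

def evidence(loglike, eta_grid):
    """Evidence for a fixed physics point: integrate the likelihood over
    the nuisance parameter against its standard-normal prior, using the
    trapezoid rule on a fixed grid."""
    prior = np.exp(-0.5 * eta_grid ** 2) / np.sqrt(2.0 * np.pi)
    y = np.exp(loglike(eta_grid)) * prior
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(eta_grid)))

eta = np.linspace(-5.0, 5.0, 2001)
# toy likelihoods for an "alternative" parameter point and the null
ll_alt = lambda e: -0.5 * (e - 0.3) ** 2
ll_null = lambda e: -0.5 * e ** 2
bayes_factor = evidence(ll_alt, eta) / evidence(ll_null, eta)
```

For these Gaussian toys the integral is analytic (∝ exp(−m²/4) for an offset m), which provides a direct check of the quadrature.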

III. SIGNAL PREDICTION
A critical aspect of this analysis is calculation of the expected detected muon distribution for each physics parameter point. We now describe the method for computing the "central" Monte Carlo model, i.e., with no systematic variations applied. This involves prediction of the atmospheric neutrino flux, calculation of the expected oscillated flux at IceCube, creation of a weighted ensemble of final-state particles, propagation of these particles through the detector model, and event reconstruction. The first two of these steps are performed for each point in the physics parameter space, which is projected onto a grid defined as:

• ∆m²41 from 0.01 eV² to 100 eV², logarithmically in steps of 0.05 (80 bins);

• sin²(2θ24) from 0.001 to 1.0, logarithmically in steps of 0.05 (60 bins);

• sin²(2θ34) from 10^−2.2 to 1.0, logarithmically in steps of 0.05 (44 bins).
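The grid above can be constructed directly (a sketch; the variable names are ours):

```python
import numpy as np

# log10-uniform grids with step 0.05, matching the stated bin counts
dm2_grid = np.logspace(-2.0, 2.0, 81)   # dm^2_41 in eV^2: 80 bins
s24_grid = np.logspace(-3.0, 0.0, 61)   # sin^2(2 theta_24): 60 bins
s34_grid = np.logspace(-2.2, 0.0, 45)   # sin^2(2 theta_34): 44 bins
```

Each range divided by its bin count gives exactly the common 0.05 step in log10, so a single interpolation scheme can serve all three axes.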

A. Atmospheric and astrophysical neutrino flux predictions
The neutrino flux is assumed to be composed of atmospheric and astrophysical neutrinos. The atmospheric neutrino flux is divided into two components: the conventional flux, produced by the decay of pions, kaons, and muons; and the "prompt" flux, produced by the decay of charmed hadrons. The astrophysical neutrino component, first observed by IceCube [109], is still of unknown origin. Its angular and energy distributions are compatible with isotropic arrival directions and a power-law spectrum [110].
The conventional component is computed using the Matrix Cascade Equation (MCEq) package [111,112]. MCEq solves the atmospheric shower cascade equations numerically, and takes as inputs the cosmic-ray model, hadronic interaction model, and atmospheric density profile, which are scanned here as continuous nuisance parameters. For our central flux we use the Hillas-Gaisser 2012 H3a [113] primary cosmic-ray model. The hadronic interactions involved in the development of the extensive air showers are modeled using the Sibyll 2.3c [114] model. The atmospheric density profile, required to predict the matter density through which the air showers develop, is extracted from AIRS satellite data [115]; further details are provided in Sec. V D 4. The month-by-month temperature profiles for each year are approximated using the 2011 temperature profiles. Using 2011 AIRS data, we compute the atmospheric flux for each month to account for seasonal variations and then construct a weighted sum over the monthly livetime of the multi-year data sample. To ensure that the effects of annual variability or systematic climate change are not so large as to invalidate this approximation, a simulation set was generated assuming an especially hot year with two Septembers and no January. The observed change in time-integrated neutrino flux was found to be comfortably within the neutrino flux systematic uncertainties.
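The livetime-weighted sum over monthly fluxes is a simple convex combination; a toy sketch with invented flux tables:

```python
import numpy as np

def livetime_weighted_flux(monthly_flux, monthly_livetime):
    """Combine per-month flux tables into one expectation, weighting
    each month by its share of the total livetime."""
    monthly_flux = np.asarray(monthly_flux, dtype=float)
    w = np.asarray(monthly_livetime, dtype=float)
    w = w / w.sum()
    # contract the month axis: sum_m w_m * flux_m(E, cos z)
    return np.tensordot(w, monthly_flux, axes=1)

# toy: 12 months of flux tabulated on a small (E, cos z) grid
flux = np.ones((12, 3, 4))
flux[5:8] *= 1.1   # hypothetical seasonal enhancement in three months
livetimes = np.full(12, 30.0)
combined = livetime_weighted_flux(flux, livetimes)
```

With equal livetimes and three months enhanced by 10%, the combined table sits at (9 + 3 × 1.1)/12 = 1.025 times the baseline.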
Because the charmed hadrons that produce the prompt atmospheric neutrino flux decay before re-interaction becomes competitive, this component is unaffected by the atmospheric density variations and is approximately isotropic. The prompt flux normalization, however, is not well known, carrying uncertainties arising from the charm quark mass and the lack of hadronic data in the very forward direction from collider experiments. In this analysis, we set the prompt νµ spectrum to the BERSS [116] calculation. This prompt flux is sufficiently sub-leading within our energy range that we do not include independent nuisance parameters to characterize its uncertainty, but rather allow any discrepancy with the central model to be absorbed by the nuisance parameters associated with the astrophysical flux.
The astrophysical neutrino flux is assumed to be isotropic, following an unbroken single power-law energy spectrum, with a central spectral index of γ = −2.5 and normalization obtained from astrophysical neutrino measurements performed by IceCube in various channels and energy ranges [110,117]. We also assume an astrophysical neutrino to antineutrino ratio of 1:1 and uniform distribution over flavors. Systematic uncertainties on the atmospheric and astrophysical fluxes are described in detail in Sec. V.

B. Oscillation prediction
Neutrino oscillation probabilities at IceCube are functions of energy and zenith angle, with the latter affecting both the distance of travel and the matter density profile traversed. The oscillation probability of high-energy atmospheric neutrinos crossing the Earth is non-trivial to calculate for several reasons: 1) the oscillation is significantly influenced by matter effects, especially in the vicinity of resonances; 2) the matter density and composition vary as a function of position in the Earth; 3) all four neutrino states participate in the oscillation; 4) absorption competes with oscillation, implying that the evolution cannot be exactly described by Schrödinger's equation. For these reasons, the neutrino oscillation probability is not solvable analytically. Instead, it is calculated numerically using the nuSQuIDS software package [85].
The nuSQuIDS package is built on the Simple Quantum Integro-Differential equation Solver (SQuIDS) framework [118]. The neutrino flavor density matrix is decomposed in terms of SU(N) generators plus the identity, and an open-system Liouville-von Neumann equation is solved numerically, seeded with the initial atmospheric neutrino flux at a height of 20 km above the Earth's surface. The evolution includes effects deriving from neutrino mass (vacuum effects), matter effects on oscillations, and absorption due to charged- and neutral-current interactions in the Earth. Neutrino and antineutrino fluxes are propagated through the Earth in all four flavors. Additional subleading effects are also included in nuSQuIDS, such as the production of secondary neutrinos in charged-current ντ interactions, a process known as τ-regeneration [119]. For sterile neutrinos in the mass range of interest, both vacuum-like and resonant oscillation effects are generally observed; see [120][121][122][123][124][125] for an extended discussion.
Our oscillation predictions consistently include the three-flavor active neutrino oscillations, with neutrino mixing parameters set to the current global best-fit values for normal ordering given in Ref. [126]. In practice both analyses are insensitive to all active neutrino mixing parameters and mass differences, since for Eν > 100 GeV the active neutrino oscillation probability is insignificant over the Earth's diameter. Fig. 1 shows some example oscillograms that illustrate the nuSQuIDS predictions, expressed as transition probabilities between the initial and final fluxes, Φ_initial and Φ_final, namely
$$P_{\bar{\nu}_\mu \to \bar{\nu}_\mu}(E, \theta_z) = \frac{\Phi^{\rm final}_{\bar{\nu}_\mu}(E, \theta_z)}{\Phi^{\rm initial}_{\bar{\nu}_\mu}(E, \theta_z)}.$$
Neutrino absorption in the Earth is computed by nuSQuIDS using the neutrino-nucleon isoscalar deep-inelastic cross-section calculation of Ref. [127]. The Earth's density is assumed to be spherically symmetric, with the density profile given by the Preliminary Reference Earth Model (PREM) [128]. Past versions of this analysis associated a systematic uncertainty with the density profile of the Earth. However, this was found to be sub-dominant to other sources of systematic uncertainty and to the per-bin statistical precision, so the effect is no longer included here.
After propagation to the vicinity of the IceCube detector, the final-state density matrix is projected into flavor space to yield the energy-and zenith-dependent flux of each neutrino flavor. This information is used along with the doubly-differential deep-inelastic neutrino scattering cross sections to weight pre-generated Monte-Carlo events.

C. Neutrino interaction cross section
In the energy range of this analysis the only relevant neutrino interaction is neutrino-nucleon deep-inelastic scattering [129]. We use the calculation reported in Ref. [127] for neutrinos and antineutrinos. The neutrino cross section is used both in calculating the Earth's opacity to high-energy neutrinos and in determining the interaction rate. The uncertainties in the cross section reported in Ref. [127] imply that the latter effect is negligible with respect to the uncertainty in the atmospheric neutrino fluxes. The effect of the cross-section uncertainty on the Earth opacity has been recently discussed in Ref. [130] and is small when considering the effect on the total rate. However, since we are searching for 1%-level distortions of the angular distribution, we incorporate an uncertainty contribution for the Earth opacity. This is implemented by computing the spectrum-dependent absorption strength for each neutrino flux component (conventional atmospheric, prompt, and astrophysical) with cross sections re-scaled within the uncertainties reported in Ref. [127]. The resulting absorption distributions are used to generate a continuous parameterization of the systematic uncertainty due to Earth opacity using the PHOTOSPLINE [131] interpolation package.

D. Detector simulation
Monte Carlo samples are constructed and employed in fits using a final-state reweighting technique. Events are generated using a reference flux and propagated through the standard IceCube Monte Carlo simulation chain, to be re-weighted to a physical flux for analysis. Cherenkov light is simulated directly through layered IceCube ice models which include the effects of absorption and scattering of light on dust and other impurities, and the response of the IceCube DOMs is simulated using the techniques described in Ref. [76]. The optical effects of refrozen ice immediately surrounding the IceCube strings are parameterized, as is the optical anisotropy of the ice, and the tilt (non-planarity) of the glacier [132].
Secondary particles are injected into the target volume encompassing the detector according to a reference energy spectrum and a continuous doubly differential cross section. We consider a range of injected primary ν µ energies from 100 GeV to 1 PeV from zenith angle 80 • (10 • above the horizon) to 180 • (upward-going). The injected energies are sampled using an E −2 power-law energy spectrum and the arrival directions are distributed isotropically in azimuth and cos(θ z ). The interaction is assigned by randomly selecting a point within a cylindrical volume centered on IceCube, whose axis is aligned with the trajectory of the incoming particle, with an injection radius of 800 m. The cylinder length is set to be the 99.9% muon range in ice plus two additional "endcaps", each with a length of 1200 m. This procedure allows for efficient generation of representative Monte Carlo samples of ν µ interactions that deposit light in the IceCube detector.
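The generation scheme above can be caricatured with a toy sampler (a sketch under the stated spectral and geometric assumptions; it ignores the column-depth and cylinder-length bookkeeping of the real chain):

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_injection(n, e_min=100.0, e_max=1e6, radius=800.0):
    """Toy sampler mimicking the stated generation scheme: E^-2 energies
    between 100 GeV and 1 PeV, isotropic azimuth and cos(zenith) for
    zenith angles 80-180 degrees, and impact points distributed
    uniformly over the disk of the 800 m injection cylinder."""
    u = rng.random(n)
    # inverse-CDF sampling for a pdf proportional to E^-2 on [e_min, e_max]
    energy = 1.0 / (1.0 / e_min - u * (1.0 / e_min - 1.0 / e_max))
    cos_z = rng.uniform(-1.0, np.cos(np.radians(80.0)), n)
    azimuth = rng.uniform(0.0, 2.0 * np.pi, n)
    impact = radius * np.sqrt(rng.random(n))  # uniform over disk area
    return energy, cos_z, azimuth, impact
```

The sqrt in the impact-parameter draw makes the points uniform in area rather than in radius, so the mean squared impact parameter is R²/2.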
For each event, the incident neutrino energy, final-state lepton energy and zenith angle, Bjorken x and y interaction variables, probability of the neutrino interaction, and the properties associated with the injection point (total column depth and impact parameter) are recorded. A full simulation set in this analysis contains 2 × 10⁹ such events, each generated with independent seeds, yielding a number of events approximately equal to 500 years of detector data. For each oscillation hypothesis and systematic uncertainty parameter configuration, the event ensembles are re-weighted according to the final-state prediction for that model. The charged final-state secondaries are propagated through the ice using the PROPOSAL package [133], which models ionization and stochastic energy losses, accounting for bremsstrahlung, photonuclear processes, electron pair production, the Landau-Pomeranchuk-Migdal and Ter-Mikaelian effects, and Molière scattering. Along the track, photons are generated randomly according to the parameterization of the Cherenkov radiative emission from tracks (muons) or cascades (electromagnetic or hadronic showers). Each photon is tracked as it propagates through the ice until it is either absorbed or interacts with a DOM. The photon propagation accounts for random scatters according to the ice model, described by depth- and wavelength-dependent scattering and absorption [134][135][136][137], and anisotropy [132] along a major and minor axis. At each photon scattering point, the algorithm randomizes the new photon direction based on a scattering angle distribution parameterized by the mean scattering angle and scattering coefficient. For each photon that strikes a DOM in the simulation, the detector response is modeled according to standard IceCube methods, which are outlined in Ref. [79].
Simulated events are reconstructed using the same algorithms that are applied to real events, in the same manner as the previous generation of IceCube sterile neutrino searches [75,94].

IV. EVENT SELECTION
Muons are identifiable in IceCube by the track-like nature of emitted Cherenkov light as they propagate through the ice. The event selection defines the set of criteria used to reduce the background event contamination while maintaining a high-efficiency selection of atmospheric νµ events. The primary background contributions comprise air-shower cosmic-ray muons (sometimes referred to herein as "atmospheric" muons), neutral-current neutrino interactions, charged-current electron neutrino interactions, and charged-current tau neutrino interactions. The event selection described in this section identifies 305,735 events, shown distributed in reconstructed energy and cos(θz) in Fig. 2. The energy and zenith distributions of the data are shown separately in Fig. 3.
Despite the 1.5 km of overburden directly above IceCube, the detector is triggered at a rate of approximately 3 kHz [138] by downward-going muons produced in cosmic-ray air showers. The simulation of cosmic-ray air showers is handled by the CORSIKA Monte Carlo package [139,140]. Eight independent CORSIKA simulation sets containing 6 × 10⁸ events are used to quantify the amount of cosmic-ray muon contamination in the event selection, covering primary cosmic-ray energies from 6 × 10² GeV to 1 × 10¹¹ GeV. CORSIKA simulates the air showers to ground level, propagating the cosmic-ray muons through the firn and ice to a sampling surface around the detector. The cosmic-ray muons are then weighted to an initial cosmic-ray flux, in this case Hillas-Gaisser 2012 H3a [113].
At the Earth's surface, the conventional ν µ flux dominates the neutrino flavor composition. The sub-dominant electron and tau neutrino flavors represent a far lower fraction of the background than the cosmic-ray muons. The topological signature of cascades, primarily caused by electron-neutrino and neutral-current neutrino interactions, is sufficiently different from the track-like topology that they are efficiently rejected. Tau neutrinos can interact via charged-current interactions, producing a tau lepton and a cascade-like shower. When the tau lepton decays to hadrons, these events are also efficiently rejected. However, the tau lepton can instead decay to a muon and flavor-conserving neutrinos with a branching ratio of 17.39 ± 0.04% [126]. While the signature of these events is track-like in nature, the ν τ -appearance probability from standard oscillations is small at TeV energies, considering that the first ν µ → ν τ oscillation maximum occurs at approximately 25 GeV for upward-going neutrinos. The electron and tau neutrino backgrounds are accounted for in dedicated simulation sets, each with an effective livetime of approximately 250 years.
The event selection for this analysis is the union of two event filters, referred to hereafter as the Golden and the Diamond filters (Secs. IV B and IV C). If an event passes either one of these filters, it is included in our final sample. For both filters, we first require that every event passes the online IceCube muon filter, which selects track-like events. We then pass the event through a series of precuts used to reduce the data and MC to a manageable level (Sec. IV A). An energy cut is placed at 500 GeV reconstructed energy, since events below this energy are found to contribute minimally to the sensitivity of the analysis, and may be subject to additional low-energy systematic uncertainties. We also place a cut to select upward-going events, i.e. cos(θ z ) ≤ 0, since above the horizon the sample is most likely to have atmospheric muon background contamination; this cut also has minimal impact on sensitivity. Fig. 4 shows the expected event rate distributions in reconstructed energy and reconstructed cosine zenith for the different event types, generated using MC events passing the event selection. We show the predicted true neutrino energy distribution of the conventional atmospheric neutrinos in the sample in Fig. 5. We find that greater than 90% of our events originate from a neutrino with a true energy between 200 GeV and 10 TeV. The observed zenith angle can be taken as the true zenith angle, θ reco z = θ z , for practical purposes, since within our angular bins the difference in zenith angle between the reconstructed muon track and the MC truth is negligible (< 1°, discussed in more detail in Ref. [141]).

A. Precuts and low-level reconstruction
Before applying high-level event selections, a series of precuts are applied to reduce data volume and reject low-quality event candidates. These precuts are: 1. If the reconstructed direction is above the horizon, cos(θ z ) ≥ 0.0, require that the total event charge (Qtot) is greater than 100 photoelectrons (PE) and the Average Charge Weighted Distance (AQWD) is less than 200 m/PE. The AQWD is defined as the charge-weighted average distance of the pulses produced by Cherenkov light in each PMT from the track hypothesis, divided by the total charge of the event.

2. Reject all events with a reconstructed zenith angle with cos(θ z ) ≥ 0.2. The vast majority of these are muons produced in atmospheric showers.
3. Require at least 15 triggered DOMs per event, and ≥ 6 DOMs triggered on direct light. Direct light refers to the Cherenkov photons which arrive at the DOMs without significant scattering, identified via event timing.
4. The reconstructed track length using direct light (DirL) in the detector must be greater than 200 m (DirL ≥ 200 m), and the absolute value of the smoothness factor (DirS) must be smaller than 0.6 (|DirS| ≤ 0.6). The smoothness factor [142] measures how uniformly the triggered DOMs are distributed along the reconstructed track; for well-reconstructed events, direct hits should be smoothly distributed along the track.
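The four precuts above can be summarized in a short sketch. The field names and all event values are invented for illustration and do not correspond to actual IceCube data structures:

```python
import numpy as np

# Toy event records; field names are illustrative, not actual IceCube keys.
events = np.array(
    [(0.1, 150.0, 120.0, 20, 8, 250.0, 0.4),   # downgoing but high quality: kept
     (0.1, 50.0, 120.0, 20, 8, 250.0, 0.4),    # downgoing, low charge: cut 1
     (-0.5, 10.0, 300.0, 30, 10, 220.0, 0.2),  # upgoing: charge cut not applied
     (-0.5, 10.0, 300.0, 10, 3, 220.0, 0.2),   # too few DOMs: cut 3
     (-0.5, 10.0, 300.0, 30, 10, 150.0, 0.2)], # short direct track: cut 4
    dtype=[("cos_zenith", "f8"), ("qtot", "f8"), ("aqwd", "f8"),
           ("n_doms", "i8"), ("n_dir_doms", "i8"), ("dir_length", "f8"),
           ("dir_smoothness", "f8")])

def passes_precuts(ev):
    if ev["cos_zenith"] >= 0.0 and not (ev["qtot"] > 100 and ev["aqwd"] < 200):
        return False                      # precut 1: above-horizon quality cut
    if ev["cos_zenith"] >= 0.2:
        return False                      # precut 2: reject steep downgoing tracks
    if ev["n_doms"] < 15 or ev["n_dir_doms"] < 6:
        return False                      # precut 3: DOM multiplicity
    if ev["dir_length"] < 200 or abs(ev["dir_smoothness"]) > 0.6:
        return False                      # precut 4: direct-light track quality
    return True

kept = [passes_precuts(ev) for ev in events]
print(kept)  # [True, False, True, False, False]
```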
For every event that passes the precuts, we apply the following reconstruction methodology: 1. The event passes through an event splitter to separate coincident events into multiple independent sub-events. A coincident event is defined as an event in which an uncorrelated cosmic-ray muon entered the detector during the readout. We allow a maximum deviation of 800 ns from the speed-of-light travel time for a pair of hits to be considered correlated. Approximately 10% of neutrino events have an accompanying coincident muon in the time window.
2. Reconstruct the trajectory of each sub-event iteratively, using several timing-based reconstruction algorithms. The first algorithm uses a simple least-squares linear regression to fit the timing distribution of the first PE observed on each DOM [143]. Then, algorithms incorporating the single-photoelectron and multi-photoelectron information are used to refine the fit. These use likelihood constructions to account for the Cherenkov emission profile as well as the ice scattering and absorption, initially using the first detected photon and then all detected photons, respectively. We require that each fit succeeds in order to keep the event in the sample.
3. Reject events using a likelihood ratio comparison between the unconstrained track reconstruction and one that has a prior on the reconstructed direction. The prior, defined in Ref. [100], utilizes the fact that the majority of muon tracks are from downward-going cosmic-ray events and are expected to come from the Southern Hemisphere.
4. Calculate a variable to quantify the uncertainty in the reconstructed trajectory [144]. This "paraboloid sigma" value encodes the uncertainty on the trajectory reconstruction based on the likelihood profile around the best-fit reconstructed track hypothesis, with a small value indicating better precision in the reconstructed trajectory. A second variable, called the "reduced Log likelihood" (RLogL), uses the best-fit likelihood value as a global measure of the success of the fit.
5. Reconstruct the event energy. Unlike the trajectory reconstructions, energy reconstruction relies heavily on the intensity of the light incident on each DOM. Given the trajectory reconstruction, an analytical approximation for the observed light distributions is used, which accounts for the geometry between the emitter and receiver, the ice absorption and scattering, and detector noise. Stochastic losses from high-energy interactions imply that there will be points along the track with bursts of comparably more intense light. This is averaged out by broadening the PDF that describes the energy loss expectation. Further information can be found in Refs. [100,145].
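The first-guess timing fit in step 2 can be illustrated with a toy least-squares "linefit": assume the hit positions follow a straight line x(t) = x0 + v t and solve in closed form. The hit times, positions, and noise level below are synthetic; the production algorithm is described in Ref. [143]:

```python
import numpy as np

# Synthetic first-PE hits along a toy track (positions in m, times in ns).
rng = np.random.default_rng(1)
true_v = np.array([0.0, 0.1, 0.3])           # toy "velocity" direction, m/ns
true_x0 = np.array([10.0, -5.0, 0.0])
t = np.sort(rng.uniform(0.0, 3000.0, 40))    # first-PE times on 40 DOMs
hits = true_x0 + np.outer(t, true_v) + rng.normal(0.0, 5.0, (40, 3))

# Closed-form least-squares solution for x(t) = x0 + v * t
t_mean = t.mean()
x_mean = hits.mean(axis=0)
v_fit = ((t - t_mean) @ (hits - x_mean)) / ((t - t_mean) @ (t - t_mean))
x0_fit = x_mean - v_fit * t_mean

# The fitted direction gives a first-guess zenith for later likelihood fits
zenith_fit = np.arccos(-v_fit[2] / np.linalg.norm(v_fit))
print(np.allclose(v_fit, true_v, atol=0.05))
```

In the real chain, this fast first guess only seeds the likelihood-based reconstructions that model the Cherenkov cone and photon scattering.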
The total rate of both signal and background after the precuts is approximately 1280 mHz (110,592 events per day). This is composed almost entirely of cosmic-ray muons.

B. The Golden Filter
The Golden filter was originally designed as the event selection for diffuse astrophysical neutrino searches [146]. It was optimized to accept high-energy ν µ and was subsequently used in the one-year IceCube high-energy sterile neutrino search [86]. A detailed description of the cuts can be found in Ref. [100]. In brief, following simple charge multiplicity cuts, downward-going track-like events are selected and reconstructed using algorithms of increasing complexity, with successive track quality cuts applied at each stage to reject cosmic-ray muon backgrounds (see Supplementary Material of Ref. [100]). The event selection for the one-year diffuse neutrino analysis was determined to have a greater than 99.9% ν µ purity based on simulated neutrino and cosmic-ray events. Further scrutiny of approximately 1000 events during the preparation of the present analysis revealed evidence for an approximate 1% contamination due to coincident cosmic-ray muons. All of these events have an AQWD greater than 100 m/PE. In many of these events, a coincident poorly reconstructed cosmic-ray muon visibly passed through the detector simultaneously with a neutrino event. Supplementary cuts were added to the event selection of the one-year sterile neutrino analysis [86] to remove these events. It was verified that the impact of this contamination on the final analysis result was negligible.
All events passing the Golden filter are included in our sample, in addition to extra events recovered using a new, higher-efficiency filter described in Sec. IV C. The event counts predicted to pass each filter are shown in Table II, and for the union in Table I.

C. The Diamond Filter
The Diamond filter represents a new event selection introduced in this work, targeted at improving detection efficiency while maintaining high sample purity. The Diamond filter begins with a second data reduction step beyond the precuts defined above: 1. The total charge of the event must be greater than 20 PE outside of DeepCore (Qtot > 20 PE).
4. The reconstructed trajectory cannot extend significantly above the horizon (cos(θ z ) < 0.05).
These cuts reduce the total rate to approximately 20 mHz (1728 events per day) and are each illustrated in Fig. 6.
The ice and rock overburden provides the greatest natural handle on the atmospheric muon contamination. Horizontal trajectories, for example, have approximately 157 kilometres water equivalent of shielding between the atmosphere and the detector. Any atmospheric muon reconstructed with a trajectory originating below the horizon will likely have a poor reconstruction, quantified by a large value of paraboloid sigma. A two-dimensional cosmic-ray muon cut leverages this principle, shown in Fig. 8. At small overburdens, for events near the horizon, we require a smaller uncertainty in the track reconstruction, namely smaller values of paraboloid sigma. A Bayesian likelihood ratio (BayesLLHR), formed by comparing the reconstruction likelihood with a prior favoring downward-going arrival directions to the reconstruction likelihood without this prior, was introduced in Ref. [147] specifically to reduce the cosmic-ray muon backgrounds. We include a cut on the Bayesian likelihood ratio as a function of overburden.
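The quoted horizontal overburden can be checked with simple spherical geometry. This sketch assumes a detector depth of roughly 1.95 km and quotes distances in kilometres of ice rather than an exact water-equivalent:

```python
import numpy as np

# Slant overburden vs. zenith for a detector buried at depth d in a spherical
# Earth of radius R: distance along the track to the surface is
# L(theta) = -(R - d) cos(theta) + sqrt(R^2 - (R - d)^2 sin^2(theta)).
R = 6371.0   # Earth radius, km
d = 1.95     # assumed detector depth, km (illustrative)

def overburden_km(cos_zenith):
    r = R - d
    sin2 = 1.0 - cos_zenith**2
    return -r * cos_zenith + np.sqrt(R**2 - r**2 * sin2)

vertical = overburden_km(1.0)     # straight down from above: just the depth
horizontal = overburden_km(0.0)   # close to the ~157 km quoted above
print(round(vertical, 2), round(horizontal, 1))  # 1.95 157.6
```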
A series of straight cuts were then introduced on the center of gravity of the charge in both the vertical direction (COGZ) and the radial direction (COGR). These cuts reduce the contamination by events near the edge of the detector, known as "corner-clipping" events, which have a higher probability of being misreconstructed cosmic-ray muons. We also introduce the same updated AQWD cut found in the Golden filter. Finally, we attempt to remove residual background events with some simple safety cuts, shown in Fig. 9 and Fig. 10. The two-dimensional RLogL and DirNDoms cuts are also used in the Golden filter and found to be useful without affecting neutrino data. The resulting event count predictions after these cuts are shown in Table I.

D. IceCube data selection
IceCube data are typically broken up into 8-hour runs, which are vetted by the collaboration as having "good" data (details can be found in Ref. [76]). For every good run, we additionally require that all 86 strings are active, as well as at least 5,000 active in-ice DOMs. This data quality condition results in less than 0.4% loss of livetime. An average of 5048 ± 4 DOMs are active in the detector throughout all seasons used in this analysis. We observe no significant deviation in the average event rate throughout the years. The seasonal data rates are shown in Table III. The PMT gain is known to vary with time; at the beginning of every season the DOM high voltage is adjusted accordingly to maintain a gain of 10^7. To verify the stability of the extracted charge as a function of time, an analysis of the time variation of the single photoelectron charge distribution was performed. The components used to describe the single photoelectron charge distribution, the sum of two exponentials and a Gaussian, are found to have no systematic variation as a function of time greater than that observed by random scrambling of the years. This is reported in Ref. [148] and is in agreement with the stability checks performed in Ref. [79].
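The single-photoelectron charge model mentioned above (two exponentials plus a Gaussian) can be sketched as follows; every parameter value here is invented for illustration, with the calibrated components reported in Ref. [148]:

```python
import numpy as np

# Toy SPE charge model: two exponentials plus a Gaussian peak.
def spe_charge_model(q, a1=0.2, w1=0.1, a2=0.3, w2=0.4,
                     ag=1.0, mu=1.0, sig=0.3):
    exp1 = a1 * np.exp(-q / w1) / w1     # steep low-charge exponential
    exp2 = a2 * np.exp(-q / w2) / w2     # broader exponential component
    gauss = ag * np.exp(-0.5 * ((q - mu) / sig) ** 2)  # main SPE peak
    return exp1 + exp2 + gauss

q = np.linspace(0.0, 3.0, 600)   # charge in units of the mean SPE charge
pdf = spe_charge_model(q)

# Away from the low-charge exponentials, the Gaussian peaks near one SPE
mask = q > 0.6
peak = q[mask][np.argmax(pdf[mask])]
print(0.85 < peak < 1.1)
```

Stability checks then amount to fitting these components season by season and verifying that the fitted parameters show no time trend.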

V. SYSTEMATIC UNCERTAINTIES
Compared to the one-year high-energy sterile neutrino search by IceCube [86], the number of ν µ events used in this analysis is nearly a factor of 14 larger. This corresponds to a significant decrease in the bin-wise statistical uncertainty in a large portion of the reconstructed energy-cos(θ z ) plane. A considerable effort has been devoted to properly modeling and understanding the systematic uncertainties at the requisite 1%-per-bin level. Each uncertainty reported in this section is described in terms of the shape it generates on the reconstructed energy-cos(θ z ) plane as it is perturbed within 1σ of its Gaussian prior. Maps of these effects as used in the analysis are provided in the Supplementary Material. The full list of nuisance parameters and their priors and boundaries is shown in Table IV.

A. DOM efficiency
The term DOM efficiency is used to describe the effective photon detection efficiency of the full detection unit. In addition to DOM-specific effects like photocathode efficiency, collection efficiency, wavelength acceptance, etc., it also encompasses any physical property that changes the percentage of photons that deposit a measurable charge in the detector globally. This includes properties external to the DOM such as the cable shadow, hole ice properties, and some aspects of bulk ice properties.
The secondary particles in simulated neutrino-nucleon interactions are propagated through the ice with an overabundance of photons produced along their track. During the detector-level simulation, photons are down-sampled: a percentage of the propagated photons are randomly destroyed to achieve the desired DOM efficiency in simulation. Events are generated at a relative DOM efficiency of 1.10, then down-sampled to the central value of 0.97, as determined by calibration measurements. Five systematically different data sets were simulated relative to the central value, at +6.3%, +4.7%, +2.4%, -1.6%, and -3.1%, which allow us to probe DOM efficiency values between approximately 0.93 and 1.03. The five discrete DOM efficiency datasets are converted into a continuous parameterization using penalized splines [149] fitted to the reconstructed energy and cos(θ z ) distributions. This procedure is also used for various systematic data sets, namely the hole ice and Earth opacity effects, and allows us to re-weight each event to any systematic value within the uncertainty range.
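The conversion of discrete systematic datasets into a continuous re-weighting can be sketched for a single analysis bin. The analysis itself uses penalized splines over the full energy-cos(θ z ) histogram; here a simple interpolation stands in, and all numbers are invented:

```python
import numpy as np

# Toy: expected count in one analysis bin as a function of DOM efficiency,
# from five hypothetical discrete simulation sets.
efficiencies = np.array([0.93, 0.95, 0.97, 1.00, 1.02])     # simulated values
bin_counts = np.array([118.0, 124.0, 130.0, 140.0, 147.0])  # toy expectations

def bin_weight(eff, central=0.97):
    """Multiplicative correction for this bin relative to the central set."""
    expectation = np.interp(eff, efficiencies, bin_counts)
    return expectation / np.interp(central, efficiencies, bin_counts)

# A slightly brighter detector should predict more events in this bin
print(bin_weight(0.98) > 1.0)
```

Repeating this per bin (or per event, as in the analysis) lets the fitter treat DOM efficiency as a continuous nuisance parameter.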
An example of the shape-only (mean normalization removed) effect of perturbing the DOM efficiency by ±1% relative to the central MC set is shown in the Supplementary Material, panels i.a and i.b. As expected from a change in the average observed charge, the shape manifests primarily as a shift in the reconstructed energy scale, with lower DOM efficiencies pulling the mean reconstructed energy to lower values. The prior is chosen to be ±10%, determined independently using minimum-ionizing atmospheric muons [145]. In practice, while the DOM efficiency prior is wide, the constraint imposed by the observed energy scale in data implies that DOM efficiency has a tight posterior in the analysis, and is not an especially limiting source of uncertainty.

B. Bulk ice
Bulk ice refers to the undisturbed ice between the IceCube strings, through which photons must propagate between emission and detection. The bulk ice is highly transparent, but residual impurities, commonly referred to as "dust," introduce both scattering [134] and absorption phenomena [136]. The dust concentration within IceCube accumulated from snowfall over the 100,000 year history of the ice [150,151], with a concentration that correlates with the climate history of the Earth. To model the depth dependence of optical scattering and absorption, IceCube uses a layered ice model [152,153], wherein absorption and scattering are parameterized for every 10 m layer. The layers are non-planar, to account for the buckling of the glacier as it has flowed, as measured with "dust-logger" devices [154] deployed into some of the holes before DOM deployment. The ice also demonstrates anisotropic light propagation [155], governed by the direction of glacial flow. These effects are all incorporated into an ice model calibrated to LED flasher data [153]. The ice model used in this analysis is several generations newer than the one used for the one-year sterile neutrino search, and includes anisotropy, tilt, and the present best-fit layered ice coefficients.
Assessing the uncertainty on this model is very challenging, because it depends on a large number of parameters, all constrained by common calibration data. A new method of treating IceCube's bulk ice uncertainties, called the "SnowStorm", was developed for this analysis, and has been published in Ref. [156]. In a SnowStorm Monte Carlo ensemble, every event is generated with a distinct set of nuisance parameters, drawn randomly from a multivariate Gaussian distribution. Manipulation of the ensemble by either cutting or weighting can be used to study the impact of each one of many potentially correlated uncertainties on the analysis, and construct a covariance matrix. Full mathematical details are provided in Ref. [156].
To maintain a manageable number of nuisance parameters for the SnowStorm method, instead of treating each ice layer coefficient as a free parameter, we select the most important ice uncertainty contributions by working in Fourier space. Perturbations to the ice model that distort the scattering or absorption parameters over detector-size scales are expected to impact analyses, whereas very localized effects are not, after averaging over incoming neutrino directions and energies. Thus the uncertainties on the lowest Fourier modes of the continuous ice model encode the majority of the uncertainty on the layered ice. Previous analyses used only the zeroth mode, i.e. the overall absorption and scattering scale, as an uncertainty.
Here, however, we find substantial impact on the analysis space from modes up to the fourth. The allowed mode variations and their covariances are constrained using flasher calibration data to yield a nuisance parameter covariance matrix. This matrix, along with nuisance parameter gradients derived using the SnowStorm method, is used to construct an energy-dependent covariance matrix in analysis space. The effect of the ice uncertainty on the zenith distribution is found to be far sub-leading.
The ice covariance matrix and the associated correlation matrix are shown in Fig. 11. In order to incorporate this covariance matrix into our nuisance parameter formalism, it must be decomposed in terms of a set of nuisance parameters that can vary within prescribed priors. To avoid incorporating the full set of nine new parameters into the fit (a phase and amplitude for each mode, plus a constant), we instead define two effective gradients that vary within a correlated Gaussian prior chosen to reproduce the same covariance matrix. This decomposition encodes the ice uncertainty budget from the first four modes in only two effective nuisance parameters, and reproduces the full covariance matrix to high precision. The effects of perturbing these parameters at the 1σ level are shown in the Supplementary Material, panels i.c and ii.a.
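The two-effective-gradient idea can be illustrated on a toy covariance matrix: build a bin-space covariance dominated by two shapes, then recover two gradients whose outer products reproduce it. The shapes, sizes, and noise level below are invented:

```python
import numpy as np

# Toy bin-space covariance dominated by two shapes plus small residual noise.
n_bins = 20
shape_a = np.linspace(1.0, -1.0, n_bins)            # tilt-like mode
shape_b = np.sin(np.linspace(0.0, np.pi, n_bins))   # bump-like mode
sigma = (np.outer(shape_a, shape_a) + np.outer(shape_b, shape_b)
         + 0.01 * np.eye(n_bins))

# Leading eigenvectors give two "effective gradients" g1, g2 such that
# g1 g1^T + g2 g2^T approximates the full covariance matrix.
vals, vecs = np.linalg.eigh(sigma)                  # eigenvalues ascending
g1 = np.sqrt(vals[-1]) * vecs[:, -1]
g2 = np.sqrt(vals[-2]) * vecs[:, -2]
sigma_rank2 = np.outer(g1, g1) + np.outer(g2, g2)

captured = (vals[-1] + vals[-2]) / vals.sum()
print(captured > 0.95)   # two effective parameters carry almost all the variance
```

Varying two Gaussian nuisance parameters along g1 and g2 then reproduces the toy covariance, mirroring the role of the two effective ice parameters in the fit.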

C. Hole ice
Each photon detected by a DOM must also propagate through the refrozen ice in the boreholes, known as "hole ice" [157], which were drilled to deploy the strings. Recorded images of the refreezing process [158] suggest that the hole ice has a transparent component extending from the edge of the hole inwards, and a central column of bubbles or impurities, roughly 8 to 10 cm in diameter. The primary effect of hole ice is to introduce additional optical scattering near the DOM, which effectively perturbs the angular acceptance curve relative to that measured in laboratory conditions.
An empirical parameterization of the DOM angular acceptance has been derived from microscopic simulations of light interacting with the hole ice, depending on two free parameters p 1 and p 2 , where η is the angle of the incoming photon, as indicated on the left side of Fig. 12. The p 2 parameter primarily varies the acceptance for upward-going photons (cos(η) = 1), the subset most strongly scattered by the bubble column. This parameter is referred to as the "forward hole ice" parameter and is included as a systematic uncertainty in this analysis. The p 1 parameter, on the other hand, is found to have a minimal impact and is fixed at its default value.
Five otherwise identical sets of Monte Carlo were generated, differing only in the forward hole ice parameter of the angular acceptance. These sets were produced with hole ice parameters p 2 = -5, -3, -1, 1, 3 and p 1 = 0.3 and are shown in Fig. 12. Each of these curves is commonly normalized to maintain a constant overall efficiency factor. Penalized splines were generated to re-weight each event to any continuous value of p 2 between -5 and 3. The central MC set was chosen to be p 2 = -1.0 and p 1 = 0.3, and we assign a wide prior to the forward hole ice parameter, namely p 2 = ±10. The shape generated by perturbing the forward hole ice parameter to -3 and +1 relative to the central set is shown in the Supplementary Material, panels ii.b and ii.c.

D. Atmospheric neutrino flux
Unlike the one-year high-energy sterile neutrino search, we have transitioned from using discrete variants of the cosmic-ray and hadronic interaction models to continuously parameterized fluxes controlled by nuisance parameters. The envelope of parameterized models is consistent with the spread of discrete models considered in the earlier analysis, with the benefit of enabling effective interpolation between them, guided by physics-motivated tunable parameters.
The uncertainty in the conventional neutrino spectrum is factorized into the uncertainty in the meson production in the atmosphere, the overall normalization, the cosmic-ray spectral index, the atmospheric density, and the rate of meson energy loss in air. The following subsections give an overview of how each of these is implemented. Fig. 13 shows how the different components are manifest in the total ν µ flux (HillasGaisser2012) as a function of energy and zenith angle. The plot on the left shows the upward-going flux and the plot on the right shows the flux from the horizon. In this figure, the colors represent the neutrino flux from a given progenitor, labeled at the top of the figure.
1. Hadronic uncertainties using the Barr parameterization

The Barr parameterization [159] describes the uncertainty associated with the production of pions and kaons in hadronic interactions based on accelerator data. The uncertainties are estimated as a function of the incident particle energy, E i , and x lab = E s /E i , where E s is the total energy of the secondary particle. They are independently calculated for positive and negative mesons. In the energy range of interest to this analysis, 100 GeV to 10 TeV, the neutrino flux is dominated by neutrinos produced from kaon decay. Using the notation of Ref. [159], the Barr parameters responsible for describing the uncertainties associated with pion production (A ± to I ± ) are found to be negligible at analysis level. We thus restrict our consideration to only those that impact kaon production above 30 GeV: W ± , Y ± , and Z ± . The relevant phase space for each parameter in terms of x lab and primary energy is shown in the second and third columns of Table V. For full details of the Barr scheme, we refer the reader to Ref. [159]. The flux gradients are constructed by computing the flux difference close to the nominal values. These gradients are then multiplied by the variation of the corresponding parameter to obtain the effective shapes in the reconstructed energy-cos(θ z ) plane. We use the suffixes "P" and "M" on the Barr parameters to denote whether they refer to positively or negatively charged mesons. The effects of varying the relevant Barr gradients are shown in the Supplementary Material, rows iii and iv.
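The gradient treatment can be sketched with a toy flux that depends linearly on one Barr-like parameter, in which case the finite-difference gradient reproduces the perturbed flux exactly. The flux model and parameter below are invented (the real gradients are computed from full flux calculations):

```python
import numpy as np

energy = np.logspace(2, 4, 50)   # 100 GeV - 10 TeV

def toy_flux(energy, wp=0.0):
    # Kaon-dominated toy flux; "wp" mimics a kaon-production parameter
    kaon_fraction = 0.8
    return energy**-3.7 * (1.0 + kaon_fraction * wp)

# Finite-difference gradient around the nominal parameter value
delta = 0.01
gradient = (toy_flux(energy, +delta) - toy_flux(energy, -delta)) / (2 * delta)

def perturbed_flux(energy, wp):
    # First-order expansion: nominal flux + gradient * parameter shift
    return toy_flux(energy) + gradient * wp

exact = toy_flux(energy, 0.3)
linear = perturbed_flux(energy, 0.3)
print(np.allclose(linear, exact))  # exact here because the toy flux is linear in wp
```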

2. Conventional neutrino flux normalization
At large values of ∆m 2 41 there are regions of the physics parameter space with small signal shape distortions but large normalization shifts, caused by fast energy-dependent oscillations that are unresolved within the detector resolution. To control against spurious fits to sterile neutrino hypotheses in these regions it is vital to include an appropriate uncertainty on the conventional neutrino flux normalization. This was primarily derived from the theoretical uncertainty reported in Ref. [112] and an extrapolation from the uncertainties quoted in the HKKM calculation given in Ref. [160].
The theoretical uncertainty reported in Ref. [112] accounts for both the cosmic ray and hadronic interaction model in the energy range of interest for this analysis. Up to approximately 1 TeV, the hadronic interaction model represents the majority of the uncertainty since the cosmicray models in this regime are relatively well established. Above this energy, the uncertainty arising from features around the cosmic-ray knee dominates the total uncertainty. The sub-TeV uncertainty is in agreement with the calculated total uncertainty found in the HKKM calculation [160]. At 1 TeV, the uncertainty is reported as 25% and consists of the uncertainties associated with the pion and kaon production, hadronic interaction cross section, and atmospheric density profile. Based on the findings described above, we include a 40% uncertainty on the conventional atmospheric neutrino normalization. The shape and normalization exhibited when perturbing the conventional atmospheric normalization by ±1σ is shown in the Supplementary Material, panels v.a and v.b. As well as changing the normalization, this uncertainty introduces a small relative shape effect from the changing ratio of contributions from conventional, astrophysical and prompt neutrino fluxes.

3. Cosmic-ray spectral slope
In the energy range of interest for this analysis, the cosmic-ray spectrum responsible for producing the atmospheric neutrinos follows approximately an E^-2.65 energy dependence. We attribute a spectral shift, ∆γ, to the energy dependence as Φ(E) → Φ(E) (E/E 0 )^-∆γ , where E 0 has been chosen to be 2.2 TeV in order to approximately preserve the total normalization. The cosmic-ray spectral index from recent measurements is shown in Table VI. Based on these measurements, we assign a prior width on the cosmic-ray spectral shift of ∆γ = 0.03.
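A minimal sketch of the spectral-shift re-weighting, using the E 0 = 2.2 TeV pivot quoted above; the toy spectrum and grid are invented, and the check simply confirms that the pivot keeps the total roughly constant:

```python
import numpy as np

energy = np.logspace(np.log10(500.0), 4, 200)   # GeV, analysis-like range
flux = energy**-2.65                             # toy cosmic-ray-like spectrum

def tilted(flux, energy, dgamma, e0=2200.0):
    # Tilt the spectrum by (E / E0)^(-dgamma) about the pivot energy e0
    return flux * (energy / e0) ** (-dgamma)

shifted = tilted(flux, energy, 0.03)
ratio = shifted.sum() / flux.sum()
print(abs(ratio - 1.0) < 0.1)   # pivot keeps the total roughly constant
```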

4. Atmospheric density
The pions and kaons produced in the hadronic showers induced by cosmic rays can either interact or decay, with the latter producing the conventional neutrino flux. The competition between the two processes depends on the local atmospheric density. IceCube has previously shown that the atmospheric conditions presented to the cosmic-ray flux can affect the atmospheric neutrino spectrum [165].
We ascribe an uncertainty to the atmospheric density by perturbing the Earth's atmospheric temperature within a prior range given by the NASA Atmospheric InfraRed Sounder (AIRS) satellite [115] temperature data.
The satellite provides open-source atmospheric data for weather forecasting and climate science and reports the temperature profile as a function of atmospheric depth and location. Using monthly averaged temperature data arranged on a 180 × 360 grid (each element representing a 1° × 1° area on the surface of the Earth), we calculate the density at 24 discrete altitudes assuming the ideal gas law, from which we linearly interpolate to describe the atmospheric density profile. A random z-score is chosen and all data points are shifted according to the reported systematic error on the AIRS measurements. This protocol is used to ascribe the uncertainty on the neutrino flux due to atmospheric density uncertainties. The 1976 United States Standard atmosphere model [166] was used as a cross-check, and falls within this envelope. The resulting atmospheric profile is injected into MCEq to generate a neutrino flux. This is performed independently for a variety of hadronic interaction models and the following cosmic-ray models:
• Zatsepin-Sokolskaya/PAMELA [169,170]
• Hillas-Gaisser/Gaisser-Honda [168,171,172]
• Poly-gonato [173,174]
For a given model and neutrino energy, we average over all months and longitudinal variations to determine the change in the zenith distribution associated with the temperature profile perturbation. Fig. 14 shows an example for a true ν µ energy of 8.9 TeV. The standard deviation, shown as the dotted red line, is computed for each zenith angle and is assigned as the atmospheric density uncertainty. Note that we force the lines to cross near cos(θ z ) = -0.7 in order to account for the 180° temperature offset between the northern and southern hemispheres. The shape generated when perturbing the atmospheric density to ±1σ is shown in the Supplementary Material, panels vi.b and vi.c; it appears primarily as a zenith-dependent effect. In total, 4450 different combinations of temperature shifts (z-score perturbations), hadronic interaction models, cosmic-ray models, monthly variations, and sampling longitudes are used to assess the spread attributed to the temperature uncertainty.
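The ideal-gas density step can be sketched as follows; the pressure levels, temperatures, and systematic errors are invented stand-ins for the AIRS profile data:

```python
import numpy as np

# Convert a temperature profile at fixed pressure levels to densities with
# the ideal gas law, then apply a common z-score shift as described above.
R_SPECIFIC = 287.05                       # J / (kg K), dry air
pressure = np.array([100e2, 300e2, 500e2, 700e2, 900e2])     # Pa (toy levels)
temperature = np.array([220.0, 230.0, 250.0, 265.0, 275.0])  # K (toy profile)
temp_sys_err = np.full(5, 1.0)            # K, toy systematic error per level

def density(temp):
    return pressure / (R_SPECIFIC * temp)  # ideal gas law at fixed pressure

z_score = 1.0                              # common shift applied to all levels
rho_nominal = density(temperature)
rho_shifted = density(temperature + z_score * temp_sys_err)
# Warmer air at fixed pressure is less dense, so mesons decay more often
print((rho_shifted < rho_nominal).all())
```

The perturbed profile would then be interpolated in altitude and passed to the flux calculation, as the text describes for MCEq.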

5. Kaon-nuclei total cross section
We must account for the uncertainty in the charged-meson energy losses during the air shower development. Of the mesons responsible for the ν µ flux, we are particularly interested in the uncertainty associated with kaon re-interactions with oxygen (O) and nitrogen (N) nuclei within the atmosphere. Uncertainties on the KO(N) total interaction cross section in principle influence the energy spectrum of the emitted neutrinos, and have been investigated and parameterized.
The total K ± -nucleon cross section has not been measured above 310 GeV [175], the lower end of our energy spectrum. From proton-proton (pp) cross section measurements, one can theoretically derive the kaon-nucleus cross section through a Glauber [176,177] and Gribov-Regge [178] multiple scattering formalism. This approach has been experimentally verified across a wide range of energies and projectile-target nuclear compositions: √s = 5.02 TeV for proton-lead (pPb) collisions [179], √s = 2.76 TeV for PbPb collisions [180], and √s = 57 TeV for p-air [181]. However, verification that this approach also holds for pO (and thus KO(N)) interactions has yet to be realized and is currently the subject of a planned LHC run in 2021-2023 [182]. At high energies, above √s ≈ 50 GeV, the total hadron-hadron cross section as a function of center-of-mass energy, √s, is given by σ ab tot (s) = Z ab + B ab log^2 (s/s ab 0 ), where B ab describes the shape and is universal for all hadron-hadron interactions (B pp = B πp = B Kp = B pn ≡ B) at high energies; Z ab is a normalization factor dependent on the projectile; and s ab 0 is a scale factor for the collision. High-energy πp (up to √s = 600 GeV) and pp (up to √s = 50 TeV) data exist and are available to constrain the universality constant B, as well as the scaling of Z ab between projectiles. Ref. [183] finds B Kp = 0.293 ± 0.026 sys ± 0.04 stat mb and Z Kp = 17.76 ± 0.43 mb. At energies above √s = 40 GeV, the total uncertainty becomes dominated by the uncertainty in the B parameter. By perturbing the total cross section within the uncertainties of B and Z ab , we determine that the uncertainty over the range of interest for this analysis (√s ≈ 20 GeV to 500 GeV) is at the few-percent level, with a modest dependence on energy.
Recent measurements indicate that the high-energy pp total cross section is known to ∼3.7% [184] and pPb to within ∼3.4% [179], in agreement with the Glauber and Gribov-Regge predictions. We include a conservative uncertainty of ±7.5% on the total kaon-nucleus cross section. The shape generated when perturbing the kaon-nucleus total cross section terms by ±1σ is provided in the Supplementary Material, panels vii.a and vii.b.

E. Astrophysical neutrino flux
The astrophysical neutrino flux is modeled as having an unbroken "single power law" energy spectrum, equal ν_µ and ν̄_µ contributions, and an isotropic angular distribution. The initial energy distribution of the astrophysical neutrino flux should not be affected by the presence of a sterile neutrino, but the normalization of each flavor component is affected; see Refs. [185, 186] for a detailed discussion. Thus, in this analysis, the energy spectrum of astrophysical neutrinos is defined by the combined ν_µ and ν̄_µ normalization at 100 TeV, Φ_astro, and the change in the astrophysical spectral index, ∆γ_astro, relative to a central value of γ_astro = −2.5, namely

Φ(E) = Φ_astro (E / 100 TeV)^(γ_astro + ∆γ_astro).

The central astrophysical flux model takes ∆γ_astro = 0. Both parameters are included as nuisance parameters in this analysis, constrained by a correlated uncertainty constructed to span IceCube's various astrophysical neutrino measurements, shown in Fig. 15. This figure also shows three previous single power-law fits to the astrophysical neutrino flux performed by IceCube [110, 187-189].
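A minimal numerical sketch of this flux model follows; the normalization passed in is a placeholder (the fitted central value is not reproduced here), and the point is only how Φ_astro and ∆γ_astro enter.

```python
# Single power-law astrophysical flux: a normalization pinned at the
# 100 TeV pivot and a spectral-index shift about gamma = -2.5.
# The normalization value used below is a placeholder, not the fit result.
E_PIVOT_GEV = 1.0e5      # 100 TeV pivot energy
GAMMA_CENTRAL = -2.5

def astro_flux(e_gev, phi_astro, delta_gamma=0.0):
    """nu_mu + nu_mu-bar astrophysical flux in units of phi_astro."""
    return phi_astro * (e_gev / E_PIVOT_GEV) ** (GAMMA_CENTRAL + delta_gamma)

# At the pivot energy the flux equals the normalization by construction,
# so the two nuisance parameters decouple there.
assert astro_flux(E_PIVOT_GEV, 1.0) == 1.0
# A softer (more negative) index suppresses the flux above the pivot...
assert astro_flux(1.0e6, 1.0, delta_gamma=-0.2) < astro_flux(1.0e6, 1.0)
# ...and enhances it below.
assert astro_flux(1.0e4, 1.0, delta_gamma=-0.2) > astro_flux(1.0e4, 1.0)
```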
As with the atmospheric neutrino flux, the astrophysical neutrino flux is propagated through the Earth using nuSQuIDS accounting for the in-Earth sterile neutrino oscillation physics, as well as high-energy neutrino attenuation within the Earth. The effects of varying the astrophysical normalization and index are shown in Supplementary Material panels vii.c and viii.a.

F. Neutrino-nucleon interaction
In our energy range, the interaction between neutrinos and matter is dominated by neutrino-nucleon deep inelastic scattering (DIS). The neutrino-nucleon cross section enters the analysis in two different parts of the simulation: during the neutrino propagation through the Earth, and at the interaction in the proximity of the detector. The latter was previously investigated thoroughly in Refs. [97, 98] and found to have a minimal impact on the final event distribution. The effect of the propagation through the Earth required further investigation for this analysis and has now been included as a nuisance parameter.
The neutrino-nucleon DIS cross section increases with neutrino energy, which makes the Earth opaque to high-energy neutrinos. We use the cross sections described in Ref. [127] for both the neutrino-nucleon interaction during propagation and the interaction near the IceCube detector. Uncertainties are provided for both NC and CC interaction channels from 50 GeV to 5 × 10^20 GeV. From approximately 10 TeV upwards, the ν_µ and ν̄_µ charged-current cross sections are predicted to within 2% and 5%, respectively. Below this energy, the Earth opacity is negligible. We include separate systematic uncertainties for neutrinos and antineutrinos, with prior widths of 3.0% for neutrinos and 7.5% for antineutrinos, in order to account for additional potential corrections from nuclear parton distribution functions [190]. The uncertainties are implemented via penalized splines constructed over 30 support points in the cross section scaling parameter, ranging from 50% to 150% of the nominal value. The shape in reconstructed energy and cos(θ_z) when perturbing the cross sections for neutrinos or antineutrinos by +10% is shown in the Supplementary Material, panel viii.b. As expected, the shape changes are primarily localized at high energies and Earth-crossing trajectories. Due to limited statistics in this region, the impact of this systematic is small compared to the flux uncertainty; see Fig. 16.
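The origin of the Earth-opacity effect can be sketched with a toy attenuation calculation. The cross-section normalization and its approximate E^0.36 scaling below are rough illustrative assumptions, not the Ref. [127] values, and a constant mean Earth density replaces the true density profile.

```python
import math

# Toy illustration of why the Earth becomes opaque to high-energy neutrinos:
# survival probability exp(-column depth * sigma). Cross-section values and
# scaling are assumptions for illustration, not the Ref. [127] tables.
N_A = 6.022e23               # nucleons per gram
EARTH_DIAMETER_CM = 1.27e9   # ~12,700 km
MEAN_DENSITY = 5.5           # g/cm^3, rough Earth average

def sigma_cc_cm2(e_gev):
    """Assumed CC cross section with an ~E^0.36 high-energy scaling."""
    return 1.0e-35 * (e_gev / 1.0e3) ** 0.36

def transmission(e_gev, cos_zenith=-1.0):
    """Survival probability along a chord; cos(theta_z) = -1 is vertically up-going."""
    chord_cm = EARTH_DIAMETER_CM * abs(cos_zenith)
    column = N_A * MEAN_DENSITY * chord_cm      # nucleons per cm^2
    return math.exp(-column * sigma_cc_cm2(e_gev))

# A +10% cross-section shift rescales the optical depth, so T -> T**1.1.
# Its effect is largest exactly where absorption is already strong:
# high energies and Earth-crossing (cos(theta_z) ~ -1) trajectories.
for e in (1.0e3, 1.0e5, 1.0e7):
    t = transmission(e)
    print(f"E = {e:.0e} GeV: P(survive) = {t:.3f} (+10% sigma: {t ** 1.10:.3f})")
```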

G. Effect on expected sensitivity
In the fit, all nuisance parameters are allowed to vary within their constraints at each physics hypothesis point. The freedom allowed by these parameters introduces uncertainty that weakens the sensitivity of the analysis to sterile neutrinos. Fig. 16 quantifies the impact of each source of uncertainty on the expected Asimov sensitivity of each analysis. For each curve, one class of uncertainties (normalization, cross sections, detector response, astrophysical, or conventional flux) is held fixed at its central values while the others are left to vary within their constraints. The largest impact in both analyses comes from the normalization freedom, followed by the conventional flux and the detector response. The effect of cross section and astrophysical flux uncertainties is comparatively very small.

H. Posteriors and pulls
The 18 nuisance parameters relating to the conventional neutrino flux, astrophysical neutrino flux, cross sections, and detector uncertainties are fit to data at each point in the parameter space. Table VII shows the minimized values at the best-fit sterile neutrino parameter points for both analyses. Each nuisance parameter includes a Gaussian constraint and central value defined in Table IV. Figure 17 shows the posterior distribution of each nuisance parameter for our Bayesian analyses at the best-fit points of Analysis I and Analysis II as grey and blue histograms, respectively. The posteriors in both analyses are rather similar, reflecting the lack of strong dependence of the nuisance parameter allowed regions on the best-fit point. Fig. 18 shows the correlations between each of the 18 nuisance parameters at the best-fit point of Analysis I; the corresponding correlations for Analysis II are not shown, but were found to be largely the same. For example, we find the conventional flux normalization to be anticorrelated with the cosmic-ray spectral index as well as the atmospheric density; the DOM efficiency to be highly correlated with the ice properties; and the astrophysical normalization to be correlated with the astrophysical spectral index.
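The mechanics of a Gaussian-constrained nuisance fit can be sketched compactly: each constrained parameter contributes a quadratic penalty to −2 log L, so the minimizer trades improved data agreement against pulls away from the prior centers. The parameter names and prior values below are purely illustrative, not the contents of Table IV.

```python
# Sketch of Gaussian prior constraints on nuisance parameters.
# Names and (center, width) values are illustrative placeholders.
PRIORS = {
    "conv_norm":   (1.0, 0.4),
    "dom_eff":     (1.0, 0.1),
    "delta_gamma": (0.0, 0.25),
}

def constraint_penalty(params):
    """Sum of ((theta - mu) / sigma)^2 over all constrained nuisances,
    added to -2 log L during minimization."""
    total = 0.0
    for name, value in params.items():
        center, width = PRIORS[name]
        total += ((value - center) / width) ** 2
    return total

# At the prior centers the penalty vanishes; a 1-sigma pull on a single
# parameter costs exactly one unit of -2 log L.
assert constraint_penalty({"conv_norm": 1.0, "dom_eff": 1.0, "delta_gamma": 0.0}) == 0.0
assert abs(constraint_penalty({"conv_norm": 1.4, "dom_eff": 1.0, "delta_gamma": 0.0}) - 1.0) < 1e-9
```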

VI. RESULTS
Following a staged unblinding and post-unblinding process to check data consistency, the results of Analyses I and II were obtained for the full IceCube data sample. During the unblinding process, some a posteriori changes were implemented to alleviate moderate data/Monte Carlo disagreement at high energies. These changes primarily involved updating the prior size on the astrophysical component to match IceCube's recent astrophysical neutrino measurements, as well as expanding the treatments of the cross section uncertainties and of the kaon energy losses. The analysis result was qualitatively unchanged following these modifications.
The analysis results are shown as likelihood maps in Fig. 19, with overlaid 90%, 95%, and 99% C.L. contours, calculated assuming Wilks' theorem. Analysis I was found to have a best-fit point at ∆m^2_41 = 4.47 eV^2 and sin^2(2θ_24) = 0.10. The test statistic (TS) compared to the no sterile neutrino hypothesis is −2∆log L_profile = 4.94, corresponding to a p-value of 8% when assuming two degrees of freedom. Analysis II was found to have a best-fit point at sin^2(2θ_34) = 0.40, sin^2(2θ_24) = 0.006, with −2∆log L_profile = 1.74, corresponding to a p-value of 19% when assuming one degree of freedom. The validity of Wilks' theorem was tested for both analyses at several points along the contour using the Feldman-Cousins method [104], with Analysis I likelihoods following a χ^2 distribution with two degrees of freedom, and Analysis II likelihoods following a χ^2 distribution with one, as expected. A full discussion of these analysis results and the interpretation of the closed 90% C.L. contour in Analysis I follows in Sec. VII.
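The conversion from the quoted test statistics to p-values under Wilks' theorem can be reproduced with closed-form χ² survival functions; the sketch below recovers the 8% and 19% figures quoted above.

```python
import math

# Chi-square survival functions for the two cases used above:
# 2 dof: P(chi^2 > t) = exp(-t/2); 1 dof: P(chi^2 > t) = erfc(sqrt(t/2)).
def p_value_chi2(ts, dof):
    if dof == 2:
        return math.exp(-ts / 2.0)
    if dof == 1:
        return math.erfc(math.sqrt(ts / 2.0))
    raise ValueError("only 1 or 2 degrees of freedom handled here")

p_analysis_1 = p_value_chi2(4.94, dof=2)   # ~0.085, i.e. the quoted 8%
p_analysis_2 = p_value_chi2(1.74, dof=1)   # ~0.19,  i.e. the quoted 19%
print(f"Analysis I : p = {p_analysis_1:.1%}")
print(f"Analysis II: p = {p_analysis_2:.1%}")
```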
Using two thousand pseudo-experiments with the nuisance parameters set to their central values, the distribution of best-fit points throughout the parameter space is determined for pseudo-experiments with no injected sterile neutrino signal. These are shown in Fig. 20. The purpose of this test is to establish whether the best-fit point of each analysis has fallen in a location where one might expect it to, given an experiment with no injected signal. As expected, the statistical fluctuations tend to populate best-fit points around the edge of the 90% C.L. sensitivity. The distribution of best-fit points for Analysis I shows a slight clustering at large values of ∆m^2_41, above ∼10 eV^2. It was found in Ref. [97] that the fast oscillations in this region average out, pulling the normalization downward with very little signal shape; see, e.g., Fig. 3.4.6 of Ref. [97]. This implies that statistical fluctuations of experiments with no true signal can find best-fit points in this region of parameter space, given a modest normalization shift. The observed best-fit points for both analyses are shown as white stars and are found in regions consistent with statistical fluctuations of the no sterile neutrino model.
The data pull relative to the best fit (BF) in bin i was calculated as Pull_i = (Data_i − BF_i)/√BF_i. The two-dimensional distribution of statistical pulls relative to the measured best-fit point for each analysis is shown in Fig. 21. We observe the maximum statistical per-bin pull out of 260 bins to be +2.7σ and the minimum to be −2.2σ. The probability that at least one bin is perturbed at least this far in either direction, given null realizations, is calculated to be 60.4% for Analysis I and 61.1% for Analysis II based on 10,000 pseudo-experiments, demonstrating that these excursions are not larger than expectations.
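The per-bin pull definition and the pseudo-experiment test for the largest excursion can be sketched as follows. The bin counts and trial numbers below are toy placeholders; the real analysis uses 260 reconstructed (energy, zenith) bins and 10,000 pseudo-experiments.

```python
import math
import random

# Per-bin pull and a Monte Carlo estimate of how often a null realization
# produces an excursion at least as large as the observed maximum pull.
def pulls(data, best_fit):
    """Pull_i = (Data_i - BF_i) / sqrt(BF_i) for each bin."""
    return [(d - b) / math.sqrt(b) for d, b in zip(data, best_fit)]

def sample_poisson(rng, mean):
    """Knuth's Poisson sampler; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mean), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

def max_pull_pvalue(best_fit, observed_max, n_trials=1000, seed=1):
    """Fraction of Poisson-fluctuated null realizations whose largest
    absolute per-bin pull is at least as large as the observed one."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_trials):
        fake = [sample_poisson(rng, b) for b in best_fit]
        if max(abs(p) for p in pulls(fake, best_fit)) >= observed_max:
            exceed += 1
    return exceed / n_trials

# Identical data and prediction give zero pull in every bin.
assert pulls([100, 50], [100, 50]) == [0.0, 0.0]
```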
The nuisance parameter pulls, defined as Sys Pull = (Fit Value − Prior Center)/Prior Width, at the best-fit points of each analysis are shown in Fig. 22. None of the nuisance parameters, in either analysis, is pulled from its associated prior by more than 2.3σ.
Both analyses appear to prefer similar systematic pulls. The largest difference observed is in the measured conventional atmospheric neutrino normalization, where the two analyses are within 8% of each other, corresponding to approximately 1.1σ given the posterior width. It is also noted that the posterior width of the neutrino-nucleon cross section is identical to its prior width, indicating that we do not have significant sensitivity to this particular source of systematic uncertainty.

To construct the expected sensitivity bands, pseudo-experiments with no injected signal are generated, and for each one the value at which its contour crosses in sin^2(θ_24) is recorded. The distribution of crossing values for sin^2(θ_24) is then used to define the 68.3% (1σ) and 95.4% (2σ) confidence intervals. If the contour crosses more than once, we take the maximum sin^2(θ_24) value of the crossing. This procedure is performed for each value of ∆m^2_41, for both the 90% C.L. and 99% C.L. contours. The resulting distributions produce "Brazil bands" and the median sensitivity values. The width of the Brazil band indicates the expected scale of statistical variations of the result over repeated pseudo-experiments, given no injected signal. Comparison of this width to the scale of effects from adding or removing systematic uncertainties provides a semi-quantitative way to determine whether the analyses are statistically or systematically limited. These bands for both analyses at the 99% C.L. are shown in Fig. 24.
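The reduction of pseudo-experiment contour crossings into a median and 1σ/2σ bands can be sketched as a simple percentile computation at a fixed ∆m^2_41; the crossing values below are fabricated for illustration, and a nearest-rank percentile stands in for whatever interpolation the real analysis uses.

```python
# Sketch of building a "Brazil band" at one dm^2_41 value from the
# distribution of sin^2(theta_24) contour crossings over null
# pseudo-experiments. Input values here are fabricated placeholders.
def brazil_band(crossings):
    """Return (median, (lo_1s, hi_1s), (lo_2s, hi_2s)) percentile bands."""
    ordered = sorted(crossings)

    def percentile(q):
        # Nearest-rank percentile; adequate for a sketch.
        idx = min(len(ordered) - 1, int(round(q * (len(ordered) - 1))))
        return ordered[idx]

    return (percentile(0.5),
            (percentile(0.1585), percentile(0.8415)),   # 68.3% band
            (percentile(0.023), percentile(0.977)))     # 95.4% band

# Fabricated crossing values: 101 points spanning 0.01 to 0.11.
median, band_1s, band_2s = brazil_band([0.01 * (1 + 0.1 * i) for i in range(101)])
assert band_2s[0] <= band_1s[0] <= median <= band_1s[1] <= band_2s[1]
```

Repeating this at every ∆m^2_41 value traces out the band as a function of mass splitting, as shown in Fig. 24.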
In both analyses, the scale of the effects of systematic uncertainties remains significantly smaller than the scale of statistical fluctuations of the final result embodied in the sensitivity interval, an indication that both analyses remain statistics-limited. The 99% C.L. contour is relatively consistent with its sensitivity envelope, being largely enclosed within the 95% region for both analyses. The 90% C.L. contour is closed in Analysis I. By construction, comparison of the 90% contour with the Brazil band should be made using its rightmost edge, which also appears consistent with the sensitivity from pseudo-experiments. Also by construction, a closed 90% C.L. contour is expected in approximately 10% of pseudo-experiments, and the best-fit point of Analysis I falls in a location consistent with expectations from null realizations. The p-value of the likelihood at this best-fit point is 8% relative to the no sterile neutrino model. We therefore conclude that this particular data realization is unexceptional relative to the results of pseudo-experiments generated under the no sterile neutrino hypothesis. Fig. 25 shows the impact on the result of removing various groups of systematic uncertainties from the analyses. This tests whether the result is especially sensitive to any specific group of systematic effects. The solid (dashed) lines in these figures show the 90% C.L. (99% C.L.) contours, and the stars represent the best-fit parameter locations. We find the main analysis results are robust in all cases. Fig. 26 shows the impact of removing any one year of data, to test for the effects of time-localized excursions on the result. The contour moves due to statistical fluctuations for each entry, but its shape is broadly unchanged when removing any one year of data, demonstrating stability against time-localized statistical or systematic fluctuations.

VIII. TESTS OF RESULT ROBUSTNESS
Additional studies were performed to test the robustness of the result, including loosening the priors on the systematic uncertainties with the largest pulls and testing the consistency of each year of data, one by one, against the total accumulated data set. In all cases, strong consistency is observed.

IX. DISCUSSION
There are three presently published results from IceCube on the 3+1 sterile neutrino model in a parameter space comparable to that of Analysis I. The three-year DeepCore result [92] excluded a region from approximately 0.1 eV^2 ≤ ∆m^2_41 ≤ 2.0 eV^2 above sin^2(2θ_24) = 0.1, extending to approximately sin^2(2θ_24) = 0.016 at ∆m^2_41 = 0.27 eV^2. Alongside the publication of the full-detector (IC86) result, an independent measurement using a partial IceCube configuration with 59 active strings (IC59) was reported at 99% C.L. These results are collected and compared with the result of this analysis and other world data in Fig. 27.
The result of Analysis I, shown in blue, is in good agreement with the previous IceCube limits at 90% C.L. The 99% C.L. exclusion region over the sterile neutrino mixing parameters is expanded relative to previous analyses. In the region below ∆m^2_41 = 0.1 eV^2, the exclusion reaches mixing amplitudes up to a factor of seven smaller, largely due to the improved statistics at low energies.
The three-year DeepCore sterile neutrino analysis has also placed limits in a space comparable to that of Analysis II. We find that the result of Analysis II improves the limit on the sterile neutrino mixing parameters below approximately sin^2(2θ_34) = 0.4, where the limit strengthens by a factor ranging from two to approximately five. This comparison is shown alongside other world data in Fig. 28.
Prior measurements of ν_µ disappearance have been made by MINOS [66-69], NOνA [191], DeepCore, Super-Kamiokande [71], MiniBooNE-SciBooNE [72, 73], and CDHS [70]. One can also compare these results to the results of global fits. The 99% C.L. limit excludes part of the allowed region from Ref. [192], and the lower island from Ref. [193]. The best-fit point from Ref. [193] lies centrally within the allowed region at 90% C.L. Despite the existence of a non-trivial allowed region in Analysis I, comparison with the preferred region from appearance experiments, where ν_µ → ν_e anomalies are observed, shows a strong tension with the IceCube result, as it does with all other ν_µ or ν̄_µ disappearance searches. The increased extent of the 99% contour in the relevant parameter space suggests that, despite finding a closed 90% contour, this result is in increased tension with the allowed region from appearance experiments, relative to the previous IceCube result.
The equivalent comparison to world data for Analysis II is shown in Fig. 28. Here, the data are compared at the 90% C.L. (top) and 99% C.L. (bottom) to other results in this parameter space from Super-Kamiokande [71] and DeepCore [92]. This analysis provides world-leading limits in the region ∆m^2_41 ≥ 10 eV^2, from approximately 0.024 ≤ sin^2(2θ_34) ≤ 0.54 and 0.012 ≤ sin^2(2θ_24) ≤ 0.16.

X. CONCLUSION
We have presented a detailed description of an eight-year search for sterile neutrinos in two parameter spaces. The result uses a new high-purity and high-efficiency event selection for upward-going track-like events, and incorporates detailed treatments of systematic uncertainties stemming from the ice properties, the detector response, and the atmospheric and astrophysical neutrino fluxes.
The results obtained by analyzing 305,735 atmospheric and astrophysical ν_µ and ν̄_µ events are used to generate confidence intervals in the space of ∆m^2_41 vs. sin^2(2θ_24), assuming θ_34 and all CP-phases to be zero (Analysis I), and in the space of sin^2(2θ_24) vs. sin^2(2θ_34) for ∆m^2_41 ≥ 20 eV^2, again assuming all CP-phases to be zero (Analysis II). In both parameter spaces, strong exclusions are obtained at 99% C.L., increasing the tension with the globally preferred regions from appearance experiments. A closed contour is observed at 90% C.L. in Analysis I, which includes parts of the allowed regions from global fits to world data. The best-fit likelihood is found to be consistent with fluctuations of the no sterile neutrino model with a p-value of 8%, and the best-fit point is unexceptional relative to closed-contour results obtained from pseudo-experiments. However, the consistency of the result across individual years of data is suggestive of a small systematic effect rather than a fluctuation of purely statistical origin. Therefore, while this result is not considered strong evidence for sterile neutrinos, it is likely to be impactful on the landscape of 3+1 global fits due to its high statistical power in the relevant parameter space.

SUPPLEMENTARY FIGURES
In this section we present, as supplementary material, analysis results for each year of IceCube data fit independently, showing the consistency of the results (Suppl. Figs. 1 and 2), and the effects of varying each one of the nuisance parameters, as referred to in the main text (Suppl. Tables I, II, and III).

SUPPL. TABLE III. Effects of systematic nuisance parameters in analysis space (see text for details)