Cosmological implications of the KOTO excess

The KOTO experiment has reported an excess of $K_L\to\pi^0\nu\bar\nu$ events above the standard model prediction, in tension with the Grossman--Nir bound. The GN bound strongly constrains new physics interpretations of an excess in this channel, but another possibility is that the observed events originate from a different process entirely: a decay of the form $K_L\to\pi^0X$, where $X$ denotes one or more new invisible species. We introduce a class of models to study this scenario, with two light scalars playing the role of $X$, and we examine the possibility that the lighter of the two new states also accounts for cosmological dark matter. We show that this species can be produced thermally in the presence of additional interactions beyond those needed to account for the KOTO excess. Conversely, in the minimal version of the model, dark matter must be produced non-thermally. In this case, avoiding overproduction imposes constraints on the structure of the low-energy theory. Moreover, this requirement carries significant implications for the scale of reheating in the early universe, generically preferring a low but observationally permitted reheating temperature of $\mathcal{O}(10\ \mathrm{MeV})$. We discuss astrophysical and terrestrial signatures that will allow further tests of this paradigm in the coming years.

On the experimental side, several $K^+\to\pi^+\nu\bar\nu$ candidate events have been observed by the E787/E949 experiments [2,5,6] and the NA62 experiment [18], but a discovery of $K^+\to\pi^+\nu\bar\nu$ has yet to be established. The current best limit on the branching ratio comes from a preliminary analysis of NA62 data and reads [47]
$\mathrm{BR}(K^+\to\pi^+\nu\bar\nu)_{\rm exp} < 2.44\times10^{-10}$ (95% C.L.), (3)
not far above the SM prediction. The NA62 experiment aims to measure the SM branching ratio with $\mathcal{O}(10\%)$ uncertainty. In the case of $K_L\to\pi^0\nu\bar\nu$, the current most stringent bound on the branching ratio comes from the KOTO experiment [3] and is still two orders of magnitude above the SM prediction:
$\mathrm{BR}(K_L\to\pi^0\nu\bar\nu)_{\rm exp} < 3.0\times10^{-9}$ (90% C.L.). (4)
Interestingly, in the latest status update by KOTO [49], four events are seen in the signal box, with an expected number of $0.05\pm0.01$ SM $K_L\to\pi^0\nu\bar\nu$ events and $0.05\pm0.02$ background events. One of the events has been identified as likely background. If the remaining events are interpreted as signal, one finds a branching ratio of $\mathrm{BR}(K_L\to\pi^0\nu\bar\nu)\sim2\times10^{-9}$ [40].
A branching ratio of this size would be a spectacular discovery. Not only would it imply new physics (NP), it would also violate the Grossman-Nir (GN) bound [31], $\mathrm{BR}(K_L\to\pi^0\nu\bar\nu) \lesssim 4.3\times\mathrm{BR}(K^+\to\pi^+\nu\bar\nu) \lesssim 10^{-9}$, when combined with the NA62 constraint in eq. (3). The GN bound is very robust in models where the $K\to\pi\nu\bar\nu$ decays are modified by heavy new physics well above the kaon mass. However, in the presence of light new physics, the GN bound can be violated, and the observed events at KOTO may find an explanation [19, 21, 23, 25, 30, 36-38, 40-42, 52].
Here we focus on a new physics scenario first discussed in [45]. Two new light scalars $S$ and $P$, neutral under the SM gauge interactions, are introduced such that $K_L$ can decay into a pair of the new particles, $K_L\to SP$. If the decay $S\to\pi^0P$ is allowed and $P$ is stable on the relevant experimental scales, then the decay chain $K_L\to SP\to\pi^0PP$ can mimic the $K_L\to\pi^0\nu\bar\nu$ signature (see fig. 1). The corresponding chain of two-body decays does not exist for the charged kaon. A possible decay $K^+\to\pi^+SP$ is suppressed by three-body phase space or may be forbidden entirely by kinematics.
If P is absolutely stable, it is also a candidate for cosmological dark matter. In the minimal setup that can provide a NP explanation of the KOTO events, P couples to the SM very weakly, implying that annihilation cross sections into SM states are too small for production by freeze-out. We therefore investigate alternative scenarios for cosmological production, and interpret overproduction of P as a cosmological constraint on the structure of the low-energy theory. We show that P is readily produced non-thermally if the scale of reheating is low, close to but safely above the current observational bound. We also show that this class of models can account for the KOTO excess without requiring a low reheating temperature, but only in the presence of additional interactions. We investigate prospects for testing this model with future experiments and with additional data from KOTO, and show that much of the parameter space will be probed in the near future. This paper is organized as follows: in section II, we present the model and discuss how it can explain the KOTO events. In section III, we evaluate astrophysical and terrestrial constraints on the parameter space of our model. In section IV, we consider cosmological production of P , and relate the production of P to the scale of reheating. We discuss the implications of our results in section V and conclude in section VI.

II. MODEL
We start with very simple kinematical considerations concerning the masses of the two scalars $S$ and $P$. Figure 2 shows the plane of the two scalar masses $m_S$ and $m_P$. As described in the introduction, we are interested in regions of parameter space where the decay $K_L\to\pi^0PP$, which mimics $K_L\to\pi^0\nu\bar\nu$, can be realized as a sequence of the two-body decay $K_L\to SP$ followed by $S\to\pi^0P$. For $m_S$ too large, the decay $K_L\to SP$ is kinematically forbidden, while for $m_S$ too small, the $S\to\pi^0P$ decay is not open; both cases correspond to the dark gray regions in the plot. In the light gray region one faces potential constraints from the charged kaon decay $K^+\to\pi^+SP$, which is generically expected in the models discussed below. In the white region, however, this decay is kinematically forbidden while the $K_L\to SP\to\pi^0PP$ chain remains open.
The plot also indicates two other interesting kinematical boundaries. If $m_P < m_{\pi^0}/2$, the exotic pion decay $\pi^0\to PP$ is possible, which, as we will discuss in section IV, can considerably impact cosmological production. If $m_S > 3m_P$, the decay $S\to3P$ can be allowed, modifying the lifetime of $S$, which is a crucial parameter for beam dump constraints.
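The kinematic boundaries described above can be summarized in a short sketch. The meson masses are PDG values, and the function and region labels are ours, not the paper's:

```python
# Hypothetical sketch: classify the (m_S, m_P) plane into the kinematic
# regions described in the text. All masses in MeV.
M_KL, M_KP = 497.6, 493.7    # K_L and K^+ masses
M_PI0, M_PIP = 135.0, 139.6  # pi^0 and pi^+ masses

def region(m_S, m_P):
    """Return which kinematic region a (m_S, m_P) point falls in."""
    if m_S + m_P > M_KL:
        return "forbidden: K_L -> S P closed"
    if m_S < M_PI0 + m_P:
        return "forbidden: S -> pi0 P closed"
    if m_S + m_P + M_PIP < M_KP:
        return "constrained: K+ -> pi+ S P open"
    return "viable: K_L -> S P -> pi0 P P only"
```

For instance, a point with $m_S=400$ MeV and $m_P=60$ MeV falls in the white (viable) region, while lowering $m_S+m_P$ below $m_{K^+}-m_{\pi^+}$ opens the charged kaon mode.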
Note that low P masses may be subject to constraints from supernova cooling, which we will discuss further in section III A. A weaker lower bound on the P mass also follows from assuming a particular thermal history, a point to which we shall return in section V.
In the following sections, we will discuss four benchmark parameter points covering the most interesting regimes (eq. (5)); the black dots in fig. 2 indicate these four benchmark scenarios, where the dark gray region is where the $K_L\to\pi^0PP$ decay cannot be realized as a sequence of two-body decays, and the light gray region is where the $K^+\to\pi^+SP$ decay is open. Next we discuss in detail the interactions of $S$ and $P$ with SM quarks. We first focus on non-renormalizable effective couplings and identify viable regions of parameter space. Then we comment on simplified UV models that map onto the effective couplings.

A. Effective interactions of the scalars and meson decay rates
We assume that the scalars $S$ and $P$ interact with SM particles via the effective couplings of eq. (6). The factors of $i$ in the Lagrangian reflect the convention that $S$ is a CP-even scalar and $P$ a CP-odd pseudoscalar, a notational pattern that we will retain when matching onto low-energy QCD later on. The coefficients $g^{SP}_{dd}$, $g^{SP}_{ss}$, $\tilde g^{SP}_{dd}$, and $\tilde g^{SP}_{ss}$ are purely imaginary (by hermiticity of the Lagrangian), while the $g^{SP}_{sd}$ and $\tilde g^{SP}_{sd}$ coefficients can have an arbitrary complex phase. There could also be interactions involving $b$ quarks, but as long as they are not considerably larger than the interactions with the light quarks, their impact on the phenomenology will be negligible.
In the following, we will also entertain the possibility of additional interactions involving $P^2$ and $S^2$, of the form given in eq. (7). While these interactions are not directly relevant for the KOTO signal, they do have important implications for other meson decays, and in particular for the dark matter phenomenology, as we will discuss in section IV below.
The decays relevant for an enhanced KOTO signal, $K_L\to SP$ and $S\to\pi^0P$, are induced by the couplings $\mathrm{Re}(g^{SP}_{sd})$ and $\mathrm{Im}(g^{SP}_{dd})$, respectively. For the corresponding decay rates we find the expressions in eqs. (8) and (9), with the phase space function $\lambda(a,b,c)=a^2+b^2+c^2-2(ab+ac+bc)$. The down and strange quark masses in these expressions should be interpreted as $\overline{\rm MS}$ masses at a renormalization scale $\mu=2$ GeV. Leading-log QCD corrections are then taken into account through the factor $\eta_{\rm QCD}$, where $M$ is the scale of new physics responsible for the effective interactions of $S$ and $P$ with the SM quarks, and, because of $SU(2)_L$ invariance, the effective couplings are proportional to the vacuum expectation value $v\simeq246$ GeV of the SM Higgs. Note that including the $\eta_{\rm QCD}$ factor is equivalent to evaluating the down and strange quark masses in eqs. (8) and (9) at the scale $M$.
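The phase space function defined above, together with the standard two-body decay momentum it controls, can be coded directly. This is a sketch; `two_body_momentum` is our helper name, not the paper's:

```python
import math

def kallen(a, b, c):
    """Phase space (Kallen) function lambda(a, b, c) from the text."""
    return a*a + b*b + c*c - 2*(a*b + a*c + b*c)

def two_body_momentum(M, m1, m2):
    """Daughter momentum in a two-body decay M -> m1 m2 (same units as M).
    This is the standard kinematic relation; it vanishes at threshold."""
    return math.sqrt(kallen(M*M, m1*m1, m2*m2)) / (2.0 * M)
```

At threshold ($M=m_1+m_2$) the momentum vanishes, so the $K_L\to SP$ and $S\to\pi^0P$ rates switch off exactly at the dark gray boundaries of fig. 2.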
The coupling $|g^{SP}_{sd}|$ can lead to the decay $K^+\to\pi^+SP$, if kinematically allowed. The differential three-body decay rate of $K^+\to\pi^+SP$ follows, where we estimate the relevant scalar form factor as $\langle\pi^+|\bar sd|K^+\rangle\simeq(m_{K^+}^2-m_{\pi^+}^2)/m_s$, and $q^2$ is the invariant mass of the $SP$ system, with $(m_P+m_S)^2 < q^2 < (m_{K^+}-m_{\pi^+})^2$.
Similarly to the $K_L\to SP$ decay, the interactions in eq. (6) also lead to the exotic eta decay $\eta\to SP$, which has been identified as a possible source of the scalar $S$ at beam dump experiments [37], and for which we find the corresponding rate neglecting $\eta$-$\eta'$ mixing. For completeness, we also provide the expression for the decay $K_S\to SP$. In the presence of the $P^2$ interactions in eq. (7), there are additional exotic meson decays, $\pi^0\to PP$, $\eta\to PP$, $K_{L/S}\to PP$, and $K^+\to\pi^+PP$, whose decay rates we list as well. In the $K^+\to\pi^+PP$ decay width, $q^2$ denotes the $PP$ invariant mass, which lies in the range $4m_P^2 < q^2 < (m_{K^+}-m_{\pi^+})^2$. The interactions of $S$ and $P$ with quarks that we have introduced preserve a $Z_2$ symmetry under which $S$ and $P$ are odd, while all SM particles are even. We assume that the $Z_2$ symmetry is also respected by the scalar potential, such that $P$ is an absolutely stable dark matter candidate. Among the allowed $Z_2$-symmetric terms in the scalar potential, the $SP^3$ coupling will turn out to be relevant. When kinematically allowed, this interaction leads to the decay $S\to3P$, whose rate involves the three-body phase space integral $f$, normalized to 1 in the limit $y\to0$. The $S\to3P$ rate will modify the lifetime of $S$ and can therefore have a crucial impact on possible constraints from beam dump experiments.

B. Events at the KOTO experiment
The model introduced in the previous section will lead to $K_L\to\pi^0PP$ events at the KOTO experiment. We now identify the regions of parameter space in which this decay can mimic the KOTO signal.
The number of signal events expected at KOTO can be written as
$$N = N_{\rm SM}\,\frac{\mathrm{BR}(K_L\to\pi^0PP)}{\mathrm{BR}(K_L\to\pi^0\nu\bar\nu)_{\rm SM}}\,R\,,$$
where $\mathrm{BR}(K_L\to\pi^0\nu\bar\nu)_{\rm SM}=(3.4\pm0.6)\times10^{-11}$ is the SM prediction for the $K_L\to\pi^0\nu\bar\nu$ branching ratio [12,14], $N_{\rm SM}=0.05\pm0.01$ is the expected number of SM signal events at KOTO [49], and $R$ is the ratio of acceptances of the considered model signal and the SM signal at the KOTO detector. As has been pointed out before [30,37,40], an exotic contribution to the KOTO signal (in our case $K_L\to SP\to\pi^0PP$) can have a considerably different acceptance.
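The event-count relation described above can be sketched numerically; `koto_events` is a hypothetical helper built from the quoted SM branching ratio and expected SM event count:

```python
# Sketch of the event-count relation in the text:
# N = N_SM * BR(K_L -> pi0 P P) / BR(K_L -> pi0 nu nu)_SM * R.
BR_SM = 3.4e-11   # SM prediction for BR(K_L -> pi0 nu nu)
N_SM = 0.05       # expected number of SM signal events at KOTO

def koto_events(br_model, acceptance_ratio):
    """Expected number of exotic signal events at KOTO."""
    return N_SM * (br_model / BR_SM) * acceptance_ratio

# An exotic branching ratio of ~2e-9 with R = 1 reproduces an
# O(few)-event excess, consistent with the signal-box counts above.
```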
We determine the acceptance ratio $R$ using a Monte Carlo simulation; details are provided in appendix A. The result is shown in fig. 3, which displays $R$ as a function of the $S$ lifetime for our four benchmark points (eq. (5)), including the limiting values for prompt decays, $\tau_S\to0$. In our setup, the lifetime of $S$ is determined by the $S\to\pi^0P$ and $S\to3P$ partial widths, which we evaluate for the four benchmark scalar-mass choices defined above; in the $S\to\pi^0P$ decay width we define $\Lambda_{dd}=\Lambda_{\rm NP}/\mathrm{Im}(g^{SP}_{dd})$. Note that $S\to3P$ is not kinematically allowed in benchmark BM3. The $S\to\pi^0P$ branching ratio follows from these two partial widths. Finally, we find the $K_L\to SP$ branching ratios in terms of $\Lambda_{sd}=\Lambda_{\rm NP}/\mathrm{Re}(g^{SP}_{sd})$.

The constraints discussed next involve couplings that are in principle unrelated to those responsible for the KOTO signal. If we assume that the coupling $g^{P^2}_{sd}$ (corresponding to $(\bar sd)P^2$) is of the same order as the coupling $\tilde g^{SP}_{sd}$ (corresponding to $(\bar si\gamma_5d)SP$), we find relevant constraints from the searches for $K^+\to\pi^+\nu\bar\nu$. To evaluate the constraints, we compare the predicted $K^+\to\pi^+PP$ branching ratio with the NA62 bound in eq. (3), correcting for the different signal acceptance of $K^+\to\pi^+PP$ relative to $K^+\to\pi^+\nu\bar\nu$ that arises from the kinematical cuts on the missing mass and the charged pion momentum. For the three $P$ masses relevant to our benchmarks, we find the bounds $\mathrm{BR}(K^+\to\pi^+PP)<2.7\times10^{-10}$ for $m_P=10$ MeV, $\mathrm{BR}(K^+\to\pi^+PP)<3.5\times10^{-10}$ for $m_P=100$ MeV, and $\mathrm{BR}(K^+\to\pi^+PP)<2.4\times10^{-9}$ for $m_P=125$ MeV. Setting $\Lambda_{\rm NP}/|g^{P^2}_{sd}|=\Lambda_{\rm NP}/\mathrm{Re}(g^{SP}_{sd})=\Lambda_{sd}$, we find that in fig. 5 the regions to the left of the dotted vertical lines are excluded.
If we assume that the coupling $\tilde g^{P^2}_{dd}$ (corresponding to $(\bar di\gamma_5d)P^2$) is of the same order as the coupling $\tilde g^{SP}_{dd}$ (corresponding to $(\bar di\gamma_5d)SP$), we find relevant constraints from the invisible branching fraction of the neutral pion, $\mathrm{BR}(\pi^0\to\mathrm{inv.})<4.4\times10^{-9}$ [47]. Setting $\Lambda_{\rm NP}/\mathrm{Re}(g^{P^2}_{dd})=\Lambda_{\rm NP}/\mathrm{Im}(g^{SP}_{dd})=\Lambda_{dd}$, in the benchmarks BM1 and BM4 the regions below the dotted horizontal lines are excluded. For benchmarks BM2 and BM3, the $P$ mass is too large for the $\pi^0\to PP$ decay, and the couplings are therefore completely unconstrained by $\mathrm{BR}(\pi^0\to\mathrm{inv.})$.

FIG. 5. Number of expected $K_L\to SP\to\pi^0PP$ events at KOTO in the $\Lambda_{sd}$-$\Lambda_{dd}$ plane for three benchmark points of the $S$ and $P$ masses. The $SP^3$ coupling is set to $\lambda_{SP^3}=10^{-5}$. The right vertical axis indicates the $S$ lifetime, which is approximately constant for $\Lambda_{dd}>10^7$ GeV.

C. Simplified UV models
The higher dimensional interactions in eq. (6) that lead to the exotic meson decays can be UV completed by simplified models in various ways. In this section, we discuss briefly two possibilities: (1) vector-like quarks and (2) an inert Higgs doublet.

Vector-like quark model
We introduce two sets of heavy vector-like quarks $D$ and $Q$, which have the quantum numbers of the right-handed down-type quark singlets, $D=(3,1)_{-1/3}$, and of the left-handed quark doublets, $Q=(3,2)_{1/6}$, respectively. These quantum number assignments admit the terms in the Lagrangian of eq. (27). The first line contains the masses $m_Q$ and $m_D$ of the vector-like quarks, as well as interactions with the SM Higgs doublet $h$. The masses $m_Q$, $m_D$ and the couplings $Y_{QD}$, $Y_{DQ}$ are in general complex parameters; however, not all of their phases are observable. Using the freedom to re-phase the vector-like quark fields, we choose real $m_Q$, $m_D$, and $Y_{QD}$ without loss of generality. The second line of eq. (27) contains couplings of the SM down and strange quarks to $S$ and the vector-like quark $D$, as well as to $P$ and the vector-like quark $Q$. The couplings $X_{Dd}$, $X_{Ds}$, $Z_{Qd}$, and $Z_{Qs}$ contain physical phases.
Note that the above Lagrangian is invariant under a Z 2 symmetry under which all SM particles are even, while the vector-like quarks as well as S and P are odd. Thus P remains an absolutely stable dark matter candidate. In addition to the couplings shown, the model could also contain Z 2 invariant couplings involving S and Q or P and D. However, such couplings are not required to generate the desired low energy interactions and we will neglect them in the following.
Integrating out the vector-like quarks at tree level (see fig. 6, left diagram) and matching onto the effective Lagrangian of eq. (6), we obtain the effective couplings. As required by $SU(2)_L$ invariance, the effective interactions $g^{SP}_{ij}/\Lambda_{\rm NP}$ and $\tilde g^{SP}_{ij}/\Lambda_{\rm NP}$ are proportional to the SM Higgs vev $v\simeq246$ GeV. If all couplings $X_{ij}$, $Y_{ij}$, $Z_{ij}$ are of $\mathcal{O}(1)$, we can expect vector-like quark masses $m_{Q,D}\sim\sqrt{\Lambda_{\rm NP}v}\sim10^6$ GeV. The couplings above are not all independent but obey a relation among the flavor-changing and flavor-conserving entries: one therefore expects the flavor-changing couplings to be of the order of the geometric mean of the flavor-conserving couplings.
The vector-like quarks also give one-loop contributions to kaon mixing. We checked explicitly that those contributions scale as $v^2/(m_Q^2 m_D^2)$ and are completely negligible.

Inert Higgs doublet model
In a second scenario, we introduce an inert Higgs doublet $H$ with mass $m_H$, which couples to the down and strange quarks, the SM Higgs, and the scalars $S$ and $P$. As in the vector-like quark scenario, the inert Higgs Lagrangian is invariant under a $Z_2$ symmetry: $S$ and $P$ are odd, while all other particles are even. Additional $Z_2$-symmetric quartic couplings of the inert Higgs involving, e.g., $S^2$ or $P^2$ are also possible, but they are not required to generate the low-energy interactions in eq. (6), and we neglect them in the following.
Integrating out the inert Higgs at tree level (see fig. 6, right diagram) and matching onto the effective Lagrangian of eq. (6), we obtain the corresponding effective couplings. In addition, integrating out the inert Higgs generates four-fermion contact interactions that modify kaon oscillations. For the resulting contribution to the kaon mixing matrix element $M_{12}$, the relevant bag parameter is $B_4\simeq0.78$ [15] (see also [17,26]), and $\eta_{\rm QCD}$ is the QCD correction factor given in eq. (10), with $M=m_H$. The contribution to $M_{12}$ modifies the neutral kaon oscillation frequency $\Delta M_K$ and the observable $\epsilon_K$ that measures CP violation in kaon mixing. Taking into account the SM predictions $\Delta M_K^{\rm SM}$ and $\epsilon_K^{\rm SM}$ from [11,13], and the corresponding experimental values from [50], we find that, assuming $|Y_{ds}|\simeq|Y_{sd}|$ and $\mathcal{O}(1)$ CP-violating phases, the kaon mixing bounds are compatible with $\Lambda_{sd}\gtrsim3\times10^9$ GeV. Note also that the bounds are entirely avoided if either $Y_{sd}$ or $Y_{ds}$ is set to zero.

III. ASTROPHYSICAL AND TERRESTRIAL CONSTRAINTS
We now consider extant astrophysical and terrestrial constraints that may apply to our model.
First, anticipating our treatment of $P$ as a dark matter candidate, we note that direct detection, indirect detection, and self-interaction constraints are not relevant for our model in its minimal configuration (see eq. (6)). If $P$ is the cosmological dark matter, but the SM couples only to the product $SP$, then direct detection is sensitive only to the inelastic scattering process $P+{\rm SM}\to S+{\rm SM}$, which is kinematically forbidden unless the dark matter is boosted. Similarly, indirect detection and self-interaction processes require two vertices, so the cross sections are suppressed by $\Lambda_{\rm NP}^4$. Extensions of our minimal model containing couplings to $P^2$ (see eq. (7)) may be subject to these constraints due to the presence of additional interactions. However, we first treat constraints from supernova cooling and beam dump experiments, which apply directly to the minimal model.

A. Supernova constraints
Supernova cooling provides powerful constraints on new weakly-coupled light particles.
Evaluating these bounds properly requires a detailed analysis that lies beyond the scope of this work, but we can perform an order-of-magnitude estimate to determine the regions of our parameter space that are likely to be subject to such constraints.
In the case of axions, the cross section for axion production $NN\to NNa$ is constrained by SN1987A to lie outside the range [46] $3\times10^{-20}\ \mathrm{GeV}^{-2}\lesssim\sigma\lesssim10^{-13}\ \mathrm{GeV}^{-2}$.
Below the lower limit, axions are not produced in sufficient numbers to affect the cooling process. Above the upper limit, the produced axions are trapped within the supernova environment, and are unable to cool the system more effectively than neutrinos. Many details of the calculation for axions should be modified in our case, but we will make a crude estimate of the constraints by requiring our production cross section to lie in the same range.
Since $P$ is stabilized by a $Z_2$ symmetry, it can only be produced in pairs, or in association with $S$. The process $NN\to NNPP$ is mediated at loop level in the minimal model, involving two insertions of the effective interaction vertex. Since $T_{\rm SN}\simeq30$ MeV [46], we estimate the corresponding cross section to lie below the constrained range, even neglecting the exponential suppression when $m_P\gtrsim T_{\rm SN}$. In the case of $SP$ production, $NN\to NNSP$, we have $m_S\gg T_{\rm SN}$, so the rate is exponentially suppressed. While parts of our parameter space are thus expected to be unconstrained by supernova limits, it is important to note that if $m_P$ is small, or if $\Lambda_{dd}\lesssim10^6$ GeV, the estimated production cross section enters the prohibited range. In particular, if $\Lambda_{dd}=10^6$ GeV, then avoiding the bound requires $m_S+m_P\gtrsim450$ MeV, favoring the larger $P$ masses in fig. 2.
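A minimal numerical version of this order-of-magnitude estimate can be written down by assuming, as this sketch does, that $SP$ emission scales like a strong-interaction cross section times $(T_{\rm SN}/\Lambda_{dd})^2$ with Boltzmann suppression; the normalization $\sigma_{\rm strong}\sim1/m_\pi^2$ is our assumption, not a derived rate:

```python
import math

# Order-of-magnitude sketch for N N -> N N S P production in the supernova
# core. All dimensionful quantities in GeV (cross sections in GeV^-2).
T_SN = 0.030                   # core temperature ~ 30 MeV
SIGMA_STRONG = 1.0 / 0.135**2  # assumed 1/m_pi^2 ~ 55 GeV^-2

def sigma_SP(lambda_dd, m_S, m_P):
    """Naive production cross section with Boltzmann suppression."""
    return SIGMA_STRONG * (T_SN / lambda_dd)**2 * math.exp(-(m_S + m_P) / T_SN)

SIGMA_MIN = 3e-20  # lower edge of the SN1987A cooling window quoted above

def evades_bound(lambda_dd, m_S, m_P):
    """True if the estimate falls below the cooling window."""
    return sigma_SP(lambda_dd, m_S, m_P) < SIGMA_MIN
```

With these assumptions, $\Lambda_{dd}=10^6$ GeV requires $m_S+m_P$ above roughly 450 MeV to fall below the window, matching the estimate in the text.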
However, in this naive projection of supernova constraints, our model remains viable in a wide region of the parameter space.

B. Beam dump constraints
In minimal form, our model of the KOTO excess is potentially subject to constraints from long-lived particle searches: the partial decay width of $S\to\pi^0P$ is bounded from below by the observed KOTO event rate, so in the absence of additional interactions, the $S$ lifetime is bounded from above. To estimate the resulting constraints, we must determine (1) how many $S$ particles are produced at a given beam dump experiment, and (2) what fraction of these undergo $S\to\pi^0P$ within the decay volume.
First we estimate the number of $S$ particles produced. There are at least two channels to consider: direct production from nucleon-nucleon scattering, and secondary production from kaon and other meson decays. Observe, however, that the fraction of proton-proton collisions producing an $SP$ pair is of order $(s/\Lambda_{\rm NP}^2)^2/\alpha_S^2$, which is much smaller than the branching ratios $\mathrm{BR}(K_L\to SP)$ and $\mathrm{BR}(K_S\to SP)$ implied by our interpretation of the KOTO excess. We also checked that the number of $S$ from eta decays $\eta\to SP$ is small compared to that from kaon decays in our scenarios.
Given $N_p$ protons on target, we expect that of order $N_K\sim10^{-2}N_p$ kaons are produced [30], and this is sufficient for kaon decays to dominate production. However, of these kaons, most will be stopped or scattered away from the beam axis before they decay. The dynamics of kaon energy loss and deflection in materials are complicated, but the nuclear interaction length for relativistic kaons in most materials is $L_{\rm nuc}\sim\mathcal{O}(10\ {\rm cm})$ [50], so we will assume that any kaon traveling this far before decaying is sufficiently slowed or deflected that only a negligible fraction of the resulting $S$ particles are directed towards the detector. Thus, the number of $S$ particles produced and directed towards the detector is of order $N_K\,\mathrm{BR}(K\to SP)$ times the fraction of kaons decaying within $L_{\rm nuc}$, where $\gamma$ is the kaon boost factor. Accounting for the fraction of $S$ particles that decay within the decay volume and for the efficiency of the detector then yields the expected number of events. In the minimal scenario, with no additional interactions, $\mathrm{BR}(S\to\pi^0P)=1$.
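The two-step estimate above (production, then decay inside the fiducial volume) can be sketched as follows; the kaon yield per proton, the boost, and the efficiency are illustrative placeholders, not the paper's fitted values:

```python
import math

C = 3.0e8  # speed of light in m/s

# Hedged sketch of the beam-dump estimate described in the text.
def expected_events(n_protons, br_K_SP, gamma_S, tau_S, d_min, d_max,
                    br_S_pi0P=1.0, efficiency=0.1):
    """Order-of-magnitude count of S -> pi0 P decays inside a decay volume
    spanning distances [d_min, d_max] (meters) from the target."""
    n_S = 1e-2 * n_protons * br_K_SP   # N_K ~ 1e-2 N_p, each K -> S P
    lam = gamma_S * C * tau_S          # boosted S decay length (beta ~ 1)
    p_decay = math.exp(-d_min / lam) - math.exp(-d_max / lam)
    return n_S * p_decay * br_S_pi0P * efficiency
```

As expected, a very short $S$ lifetime gives no events (all decays occur before the decay volume), while the event count at long lifetimes falls only linearly with $\tau_S$.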
We now estimate the event counts in the CHARM [7] and NuCal [10] experiments. To model the typical boosts of the kaons and $S$ particles that would be produced by our model, we set $\gamma_K=\gamma_S=10$ for CHARM and reduce these proportionally for NuCal's lower beam energy.
Assuming $\mathrm{BR}(S\to\pi^0P)=1$, the resulting event count is shown as a function of the $S$ lifetime in fig. 7. The minimum expected number of events at long $S$ lifetimes is large unless the relevant couplings are sufficiently suppressed.

While the interaction terms containing $(\bar qi\gamma_5q)P^2$ give rise to suppressed, velocity-dependent cross sections off nucleons, the operators $(\bar qq)P^2$ with $q=d,s$ produce potentially detectable scattering off nucleons. We define the integrated nucleon form factors $B^N_q$ in terms of the quark form factors $f^N_q$ for nucleon $N$ [39], in which the direct detection cross section can be cast. Using the central values $B^p_d\approx6.77$ and $B^p_s\approx0.50$, it is clear that the dominant effect is scattering off $d$ quarks if $g^{P^2}_{ss}\lesssim g^{P^2}_{dd}$. The scattering cross section off protons is then close to 0.1 pb. Cross sections of this order are above the expected neutrino background, and are within reach of planned future experimental sensitivity [22]. We will return to direct detection prospects in section V.

IV. COSMOLOGICAL PRODUCTION
We now turn to the question of cosmological production of the dark matter candidate P : which scenarios allow P to be produced with the observed dark matter density?
The standard thermal freeze-out paradigm is not viable in our minimal scenario. Estimating the freeze-out temperature from $n_P\,\sigma(PP\to{\rm SM})\sim H(T)$, where $\Lambda_{\rm NP}$ is the relevant scale of new physics (for practical purposes, the lesser of $\Lambda_{sd}$ and $\Lambda_{dd}$), we find that for typical values of $\Lambda_{\rm NP}$ consistent with the KOTO excess, $T_{\rm FO}\gg m_P$, so $P$ freezes out as a hot relic. Thus, for the masses and couplings considered in this work, $P$ is generically overproduced in the freeze-out scenario. If the $P$ mass were small enough to be produced with the right relic abundance, then $P$ would instead be ruled out as a dark matter candidate by structure formation constraints on relativistic relics.
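A quick hot-relic estimate, under standard cosmology assumptions, illustrates the overproduction; the prefactors below are the usual textbook values, not taken from this paper:

```python
# Hot-relic sketch: a boson (g = 1) decoupling while relativistic keeps a
# comoving yield Y ~ 0.278 / g_*S(T_fo), and today Omega h^2 ~ 2.75e8 (m/GeV) Y.
S0_OVER_RHOC = 2.75e8  # s_0 / (rho_crit / h^2) in GeV^-1

def omega_hot_relic(m_GeV, g_star_s=10.75):
    """Relic density Omega h^2 of a hot relic of mass m_GeV."""
    Y = 0.278 / g_star_s
    return S0_OVER_RHOC * m_GeV * Y

# A 10 MeV P decoupling around T ~ few MeV gives Omega h^2 ~ 7e4,
# vastly exceeding the observed 0.12, as stated in the text.
```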
Departing from the minimal scenario outlined above opens up the possibility that an additional effective interaction with SM species keeps $P$ in thermal equilibrium, and that the $P$ relic abundance is set by thermal decoupling (freeze-out). Since thermal decoupling generally happens at temperatures $T\sim m_P/25$, one can assume, in order to avoid possible constraints from BBN, that the effective interaction involves only SM neutrinos, through a dimension-five operator coupling $\bar\nu\nu$ to $P^2$ and suppressed by a scale $\Lambda_{\nu\nu}$. For this operator, we compute the zero-velocity thermally averaged product of the pair-annihilation cross section and relative velocity, and a standard treatment of the relic abundance indicates that $P$ would be produced in the right amount if $\Lambda_{\nu\nu}\simeq7$ TeV. This is several orders of magnitude above current limits on dark matter interactions with SM neutrinos, independent of flavor [9]. Thus, if $P$ were in equilibrium at high temperatures, an effective interaction with SM neutrinos, which, incidentally, can be quite naturally embedded in the UV completions described above, could suppress the $P$ abundance to an acceptable relic density in agreement with observations.
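A crude estimate of the required scale can be obtained by equating the annihilation cross section to the canonical thermal value. The normalization $|\mathcal{M}|^2\sim s/\Lambda_{\nu\nu}^2$ is this sketch's assumption, and it recovers a scale in the few-TeV range, the same ballpark as the value quoted above:

```python
import math

# Crude freeze-out estimate for a neutrino-portal operator suppressed by
# Lambda_nu: taking |M|^2 ~ s / Lambda^2 near threshold gives
# <sigma v> ~ 1 / (8 pi Lambda^2)  (our normalization assumption).
GEV2_TO_CM3S = 1.17e-17   # converts GeV^-2 to cm^3/s
SIGMA_V_THERMAL = 3e-26   # cm^3/s, canonical thermal cross section

def lambda_nu_for_thermal_relic():
    """Lambda_nu (GeV) that yields the canonical thermal cross section."""
    return math.sqrt(GEV2_TO_CM3S / (8 * math.pi * SIGMA_V_THERMAL))
```

The result lands at a few TeV; the $\mathcal{O}(1)$ difference from the quoted 7 TeV reflects the simplified matrix-element normalization used here.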
In the absence of the additional neutrino portal described in the paragraph above, the only alternative is production via freeze-in [32]. Here the dark species is produced out of equilibrium by some standard model species, and the abundance increases until cosmological expansion halts production. It is thus possible to avoid overproduction of dark matter with extremely small couplings. Note that while other mechanisms might allow for additional production of P , the freeze-in contribution is unavoidable in the range of temperatures where our effective theory is valid.
Typically, freeze-in is applied to a UV-complete theory, where the dark matter production rate can be computed starting at very high temperatures. In the context of a renormalizable model, it can be shown that dark matter is produced primarily at lower temperatures, so the details of the UV physics are unimportant. Thus, freeze-in can be used to consistently calculate the non-thermal relic abundance, even though a formal dependence on initial conditions remains. Note that this is in contrast to the freeze-out paradigm, where equilibrium with the standard model bath erases any non-trivial initial conditions in the dark sector.
However, in our scenario, the dark matter is produced through non-renormalizable interactions, and the standard freeze-in mechanism cannot be directly applied: our effective theory cannot be used at scales above a cutoff of order $\Lambda_{\rm NP}$. At first, this does not seem to be a problem: in standard freeze-in, production is IR-dominated, and we can apply our effective theory in this regime. But for higher-dimension operators, production is no longer IR-dominated, and it is no longer possible to self-consistently estimate the relic abundance unless an initial condition is fixed at a temperature where the effective theory is valid.
Naively, one can place a lower bound on the relic abundance by fixing the dark matter abundance to zero at T ∼ Λ NP and computing the amount of dark matter produced at lower temperatures, where the effective theory is valid. However, as we shall see in the following section, this still leads to overproduction of P . Thus, in our model, it would seem that dark matter is overproduced in the freeze-in scenario, even with the most favorable initial conditions.
There is, however, a significant loophole in this argument: setting the dark matter abundance to zero at $T\sim\Lambda_{\rm NP}$ is in fact not the most favorable initial condition. If reheating occurs at a temperature $T_{\rm rh}\ll\Lambda_{\rm NP}$, then the dark matter abundance should be set to zero at this lower temperature, allowing for a much lower relic abundance. There is nothing particularly unnatural about this scenario: in general, freeze-in production of dark matter depends on the reheating temperature. This dependence is weak if the reheating scale happens to be much higher than any scale in the theory, but the convenience of this arrangement does not constitute evidence for it. Moreover, if $T_{\rm rh}\ll\Lambda_{\rm NP}$, then our effective theory can be used to self-consistently compute the dark matter relic abundance independently of any UV completion. This paradigm is known as UV freeze-in [24].
A. Computing the yield

First, we briefly review the computation of the dark matter relic abundance in the standard freeze-in paradigm. The basic technology of UV freeze-in is identical to that of standard freeze-in, but the initial condition is fixed at the reheating temperature $T_{\rm rh}$, which becomes an important free parameter of the theory. In certain scenarios, the dark matter yield is quite sensitive to temperatures near $T_{\rm rh}$, and decreasing $T_{\rm rh}$ can significantly reduce the relic abundance.
The starting point is the Boltzmann equation, eq. (49). Here $n_\chi$ denotes the number density of the dark species $\chi$, $I$ and $F$ index initial and final states, $N_\chi(S)$ denotes the number of $\chi$ particles in the state $S$, $d\Pi_i = g_i\,d^3p_i/[(2\pi)^3\,2E_i]$, $|\mathcal{M}_{I\to F}|^2$ is the spin-averaged squared matrix element, and $f_k$ is the phase space density of the species $k$.
In freeze-in, one assumes that the phase space density of the dark species is always small, so that any initial state with $N_\chi(I)>0$ makes a negligible contribution in eq. (49). If all of the initial-state species are in equilibrium, the phase space densities $f_i$ can be replaced with equilibrium distributions $e^{-E_i/T}$; eq. (49) then takes the simplified form of eq. (50). We will be interested in two types of processes: $1\to2$ decays and $2\to2$ scattering. In the $1\to2$ case, for a process $i\to\chi f$, we set $\mu=m_i$, i.e., $x=m_i/T$. We recognize the decay width $\Gamma_{i\to\chi f}$ in eq. (50), which then reduces to eq. (51), where $K_1$ is a modified Bessel function of the second kind, and $N_\chi(F)$ is either 1 or 2, depending on whether $f=\chi$. Substituting $H=1.66\,g_*^{1/2}\,x^{-2}\,m_i^2\,m_{\rm Pl}^{-1}$, the total yield can be computed by performing a one-dimensional integration of eq. (51) in $x$. In particular, suppose that $f=\chi$, $m_\chi\ll m_i$, and $|\mathcal{M}_{i\to\chi\chi}|^2=\lambda^2$. If production mainly takes place during an epoch when $g_*$ and $g_{*S}$ are not changing rapidly, then the yield can be estimated in closed form.

Similarly, if the abundance of $\chi$ is set by $2\to2$ processes of the form $ij\to\chi f$, then the integrals over the final-state phase space produce the cross section $\sigma_{ij\to\chi f}$ in eq. (50). The remaining integrals can be reduced to a single one-dimensional integral, following e.g. [29]. Integrating in $x$, the yield is then expressed in terms of $m_\pm=|m_i\pm m_j|$, $r_\pm=(s-m_\pm^2)^{1/2}$, and $s_{\rm min}=\max(m_i+m_j,\,m_\chi+m_f)^2$. As in the $1\to2$ case, we can estimate the yield analytically for a process $ii\to\chi\chi$ when $m_i\gg m_\chi$ and the evolution of $g_*$ and $g_{*S}$ is negligible. For $|\mathcal{M}_{ii\to\chi\chi}|^2=\lambda^2$, the result depends on $x_{\rm min}=m_i/T_{\rm max}$; the analogous expression for $m_\chi\gg m_i$ is obtained by interchanging $m_i$ and $m_\chi$ and taking $\mu=m_\chi$ (i.e., $x_{\rm min}=m_\chi/T_{\rm max}$). However, in our model, $2\to2$ processes are driven by effective four-point vertices suppressed by the scale $\Lambda_{\rm NP}$, so we should instead set $|\mathcal{M}_{ii\to\chi\chi}|^2=s/\Lambda_{\rm NP}^2$.
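For the $1\to2$ contribution, the $x$-integration over the decaying species' equilibrium era involves $\int_0^\infty x^3K_1(x)\,dx=3\pi/2$, the standard freeze-in result. A self-contained numerical check, using the integral representation of $K_1$ (our implementation choice):

```python
import math

def K1(x, tmax=15.0, n=1500):
    """Modified Bessel K_1(x) via the integral representation
    K_1(x) = int_0^inf exp(-x cosh t) cosh t dt (trapezoidal rule)."""
    h = tmax / n
    s = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(tmax)) * math.cosh(tmax))
    for i in range(1, n):
        t = i * h
        s += math.exp(-x * math.cosh(t)) * math.cosh(t)
    return s * h

def yield_integral(xmax=30.0, n=3000):
    """Midpoint-rule evaluation of int_0^xmax x^3 K_1(x) dx;
    the tail beyond xmax is exponentially negligible."""
    h = xmax / n
    return sum(((i - 0.5) * h) ** 3 * K1((i - 0.5) * h)
               for i in range(1, n + 1)) * h
```

The numerical value converges to $3\pi/2\approx4.712$, confirming that decay-driven freeze-in production peaks around $x\sim$ few, i.e. at temperatures somewhat below the parent mass.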
In this case, the result is
$$Y_\chi(\infty) \approx \frac{45\,N_\chi(F)\,g_i^2}{128\pi^7(1.66)\,g_{*S}\,g_*^{1/2}}\,\frac{m_{\rm Pl}\,m_i}{\Lambda_{\rm NP}^2\,x_{\rm min}} = \frac{45\,N_\chi(F)\,g_i^2}{128\pi^7(1.66)\,g_{*S}\,g_*^{1/2}}\,\frac{m_{\rm Pl}\,T_{\rm max}}{\Lambda_{\rm NP}^2}.$$
This demonstrates a key difference between standard freeze-in and UV freeze-in: a naive extrapolation of the production rate to arbitrarily high temperatures (small $x_{\rm min}$) diverges.
Of course, one should not expect to accurately compute the production rate in the effective theory at $T \gtrsim \Lambda_{\rm NP}$. But even so, if $\Lambda_{\rm NP} \gg T_{\rm max} \gg \max\{m_\chi, m_i\}$, then production can be dominated by $2\to2$ processes, whereas $1\to2$ decays typically dominate in standard freeze-in. In our case, $m_\chi$ and $m_i$ are MeV-scale, while $\Lambda_{\rm NP} \gtrsim 10^6$ GeV. Thus, production by $2\to2$ processes at high temperatures is potentially very significant.
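The contrast between the two regimes can be made explicit numerically: for a constant matrix element the yield integrand falls off fast enough that the $x_{\rm min}\to0$ limit is finite, while for $|\mathcal M|^2 \propto s/\Lambda_{\rm NP}^2$ the integrand scales as $x^{-2}$ at small $x$ and the yield grows linearly with $T_{\rm max}$. A toy sketch (arbitrary normalization, with an exponential factor standing in for the Boltzmann cutoff at $x \gtrsim 1$):

```python
import numpy as np
from scipy.integrate import quad

# Toy comparison of IR- vs UV-dominated freeze-in. In the relativistic
# regime the 2->2 yield integrand scales as dY/dx ~ const for a constant
# matrix element, but as dY/dx ~ x^{-2} for |M|^2 = s / Lambda_NP^2.
def yield_ir(x_min):   # constant |M|^2: IR-dominated
    return quad(lambda x: np.exp(-x), x_min, np.inf)[0]

def yield_uv(x_min):   # |M|^2 ~ s: UV-dominated
    # split at x = 1 to help the adaptive integrator with the steep 1/x^2
    a = quad(lambda x: np.exp(-x) / x**2, x_min, 1.0)[0]
    b = quad(lambda x: np.exp(-x) / x**2, 1.0, np.inf)[0]
    return a + b

for x_min in (1e-2, 1e-3, 1e-4):
    print(x_min, yield_ir(x_min), yield_uv(x_min))
# yield_ir saturates near 1 as x_min -> 0, while yield_uv grows
# like 1/x_min, i.e., linearly in T_max = m_i / x_min.
```

The tenfold growth of `yield_uv` for each tenfold decrease in `x_min` is the linear $T_{\rm max}$ sensitivity that makes the relic density depend on the reheating temperature.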
Using the approximate forms of the yield derived above together with the dark matter abundance today, $Y_\chi(\infty) \approx 2\times10^{-6}\,(\mathrm{MeV}/m_\chi)$, we can estimate the ranges of parameters which account for all of dark matter, or at least those which do not overclose the universe.
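For reference, the target yield implied by the observed dark matter density can be reproduced from standard cosmological parameters. The snippet below uses approximate Planck-era values; the precise prefactor depends on the normalization conventions adopted for $Y$:

```python
# Target comoving yield Y = n_chi / s_entropy implied by the observed
# dark matter density, using approximate Planck-era values.
OMEGA_DM_H2 = 0.12          # Omega_dm h^2
RHO_CRIT_H2 = 1.054e-5      # critical density / h^2 [GeV / cm^3]
S_TODAY = 2891.0            # entropy density today [1 / cm^3]

def target_yield(m_chi_MeV):
    """Yield (n/s) needed for chi to make up all of dark matter."""
    m_chi_GeV = m_chi_MeV * 1e-3
    return OMEGA_DM_H2 * RHO_CRIT_H2 / (S_TODAY * m_chi_GeV)

# Y * m_chi is fixed at roughly 4.4e-10 GeV, so Y scales as 1 / m_chi.
print(target_yield(1.0))    # m_chi = 1 MeV
print(target_yield(10.0))   # m_chi = 10 MeV
```

The key point is the inverse scaling with $m_\chi$: a heavier dark matter candidate requires a proportionally smaller yield.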
If dark matter in our model is produced dominantly by quark annihilation via an interaction of the form $\Lambda_{dd}^{-1}\,\bar d(i\gamma_5)d\,SP$, then the only important parameters are $\Lambda_{dd}$ and $x_{\rm min}$. Note that if this is the only interaction at work, there is no contribution from decays. However, production at such low temperatures introduces a new complication: our simple estimates above have presumed not only that production is dominated by $2\to2$ processes, but also that the initial state consists of free quarks. If $T_{\rm rh} < 100$ MeV, then quarks are confined into hadrons during the entire production period. One must then modify the effective couplings to account for hadronic scattering, and since the initial and final states are all (pseudo)scalars, the matrix elements no longer carry any $s$-dependence. Additionally, since single hadrons can now decay to $S$ and $P$, hadronic decays can dominate the relic abundance, and must be included in the calculation of the yield.
In the following section, we treat these issues in detail and calculate the relic density numerically.

B. Determining the reheating temperature
Our estimates in the previous section suggest that $P$ can be produced non-thermally, and can account for all of dark matter, if the initial temperature of the SM bath is between 10 MeV and 100 MeV. We now refine our estimate of the yield to account for confinement and hadronic decays, and then numerically compute the yield to establish the required reheating temperature in our model.
At $T \lesssim 200$ MeV, quarks are confined into hadrons, and the effective interactions of the hadrons with $S$ and $P$ are well described by chiral perturbation theory ($\chi$PT). The effective couplings of hadrons to $S$ and $P$ are built from a combination of the new physics scales and QCD parameters. Since the couplings in the quark-level effective Lagrangian are proportional to $\Lambda_{\rm NP}^{-1}$, and the hadron-level $1\to2$ coupling must have mass dimension 1, the latter must be of order $\Lambda_{\chi\rm PT}^2/\Lambda_{\rm NP}$, where $\Lambda_{\chi\rm PT}$ is some scale associated with low-energy QCD. Similarly, in the $2\to2$ case, the hadron-level coupling should have the form $\Lambda'_{\chi\rm PT}/\Lambda_{\rm NP}$. As we will see momentarily, $\Lambda^{(\prime)}_{\chi\rm PT}$ is a combination of two constants, $f_\pi \approx 92$ MeV and $B_0 \approx 2666$ MeV. To determine the couplings explicitly, we match our effective quark-level Lagrangian onto the $\chi$PT Lagrangian following [27,44]. Our application of this method to light scalars is also similar to the treatment in section 3.1 of [52].
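The dimensional-analysis scaling can be made concrete with a short numerical sketch. Here $\Lambda_{\rm NP} = 10^6$ GeV is a hypothetical benchmark chosen purely for illustration, not a fitted value:

```python
# Dimensional-analysis estimate of hadron-level couplings to S and P,
# using the chiPT constants quoted in the text. Lambda_NP = 1e6 GeV is
# an illustrative benchmark, not a fitted value.
F_PI = 0.092    # f_pi [GeV]
B_0 = 2.666     # B_0 [GeV]

def hadronic_couplings(Lambda_NP_GeV):
    g3 = B_0 * F_PI / Lambda_NP_GeV   # 1->2 (3-point) coupling [GeV]
    g4 = B_0 / Lambda_NP_GeV          # 2->2 (4-point) coupling [dimensionless]
    return g3, g4

g3, g4 = hadronic_couplings(1e6)
print(g3, g4)  # ~2.5e-7 GeV and ~2.7e-6
```

Both couplings are minuscule, which is what places the model in the feebly-coupled (freeze-in) regime rather than the freeze-out regime.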
The interactions of QCD degrees of freedom with our light scalars can be written as couplings of the quarks to external currents $s$ and $p$, respectively a scalar and a pseudoscalar.
These take the form of source terms in the quark-level Lagrangian, $\mathcal L \supset -\bar q\,(s - i\gamma_5\,p)\,q$. Interactions of hadrons with these currents enter the $\chi$PT Lagrangian via the combination $\chi = 2B_0(s + ip)$. At lowest order, we have
$$\mathcal L_{\chi\rm PT} \supset \frac{f_\pi^2}{4}\,\mathrm{Tr}\!\left[\chi U^\dagger + U\chi^\dagger\right], \qquad U = \exp\!\left(i\sqrt2\,\Phi/f_\pi\right),$$
where $\Phi$ is the PNGB matrix [see e.g. 44]. Now consider a quark-level interaction of the form of eq. (61), where $O_S$ is a scalar (CP-even) operator and $O_P$ is a pseudoscalar (CP-odd) operator. We can then identify the currents $s$ and $p$ in terms of $O_S$ and $O_P$. Substituting these expressions into eq. (61) with $O_S = S^2, P^2$, and $O_P = SP$ gives the interactions of $S$ and $P$ with the PNGBs. For instance, the interactions of $S$ and $P$ with $\pi^0$ are specified by eq. (65), where the ellipsis denotes a series of higher-dimensional operators. We include all terms up to second order in the PNGB fields in our analysis, and the form of the hadron-level Lagrangian is as expected from dimensional analysis. Note that it is essential to consider complex-valued $g_{ij}$ and $\tilde g_{ij}$, without which some interactions will vanish.
We can now determine the reheating temperature required to produce the observed dark matter density as a function of our model parameters. First, using the normalization factors as they appear in eq. (65), we can estimate the relative significance of decays and scattering, starting with the decay and scattering yield estimates of the previous section. Assuming that all dimensionless couplings are $O(1)$, we set the coupling $\lambda$ for 3-point vertices equal to $B_0 f_\pi/\Lambda_{\rm NP}$, and we set the coupling for 4-point vertices to $B_0/\Lambda_{\rm NP}$. In this regime, we typically have $m_i \gg \max\{m_P, T_{\rm rh}\}$, and in this limit, the yield from decays generically dominates the yield from scattering. Our parameter space includes $1~\mathrm{MeV} \lesssim m_P \lesssim 200~\mathrm{MeV}$, so the ratio of the decay yield to the scattering yield can be large or $O(1)$ depending on the choice of the $P$ mass, but it is never small. Note, however, that increasing $m_P$ can also close certain decay channels. In particular, if there exist interactions allowing the decay $\pi^0 \to PP$, this channel naively dominates production at low temperatures, but is closed for $2m_P > m_{\pi^0}$.
Since decays dominate in most of the parameter space, we can make a first estimate of the yield by considering only production via $K_L \to SP$, the same decay process which is necessary to account for the KOTO excess. Neglecting the distinction between $m_S$ and $m_P$, the yield from this channel can be evaluated in closed form, and it translates into an upper bound on $\Lambda_{sd}$. For the typical parameter values selected above, this upper bound is towards the lower edge of our parameter space of interest for the KOTO excess. Thus, although hadronic decays significantly enhance production relative to the prediction of eq. (59), this channel on its own does not pose an obstacle to accounting for the KOTO excess.
However, in general, it is necessary to numerically evaluate the yield to determine the extent of the viable parameter space, and, in particular, to identify the reheating temperature that produces the observed relic density at each parameter point. The resulting reheating temperatures are shown in fig. 8, and are of order 10 MeV throughout the parameter space of interest. The required reheating temperature is mainly controlled by the smaller of $\Lambda_{sd}$ and $\Lambda_{dd}$, with a slight bias towards $\Lambda_{sd}$, since production by $\eta$ decays is suppressed compared to production by $K^0$ decays due to their relative masses. Note that all couplings except for $g^{SP}_{sd}$ and $g^{SP}_{dd}$ are neglected in fig. 8, so, in particular, $\pi^0 \to PP$ does not contribute to the relic density even when $2m_P < m_{\pi^0}$. If we suppose that all of the couplings in the effective theory are of similar order, the viable parameter space can change significantly.
We can estimate this effect by taking $g^{S^2}_{q_1q_2} = g^{SP}_{q_1q_2} = g^{P^2}_{q_1q_2}$ and setting $g^{O}_{sd} = (g^{O}_{dd}\,g^{O}_{ss})^{1/2}$ for each operator type $O$. The resulting reheating temperatures are shown in fig. 9. With these choices for the couplings, our two benchmark points with $m_P = 10$ MeV are incompatible with freeze-in as a production mechanism, since the required reheating temperature is below observational bounds throughout the relevant parameter space. This is due to the open $\pi^0 \to PP$ decay, which is kinematically closed for the other two benchmark points with $m_P = 100$ MeV and $m_P = 125$ MeV. For these points, the required reheating temperature is again of order 10 MeV throughout the relevant parameter space. At the top-left of the corresponding panel of fig. 9, the required reheating temperature decreases with increasing $\Lambda_{dd}$. This is just because of our assumption that $g^O_{sd}$ is the geometric mean of $g^O_{dd}$ and $g^O_{ss}$: increasing $\Lambda_{dd}$ corresponds to decreasing $g^O_{dd}$, so if $g^O_{sd}$ is held fixed, then $g^O_{ss}$ must increase to compensate. This increases the relic density, forcing a lower reheating temperature.
Finally, we note that the reheating temperatures shown in figs. 8 and 9 are potentially imprecise, and should be viewed as lower bounds. Our calculation of the yield assumes that all of the initial-state species are thermalized, but the mesons freeze out at temperatures of the same order considered here. In particular, π 0 , K 0 , and η freeze out at 3 MeV, 10.5 MeV, and 11.6 MeV, respectively. In a scenario with a high reheating temperature, this concern would be less significant: the mesons would have a thermal distribution at early times, so as long as dark matter production is not dominated by temperatures well below the mesons' freeze-out temperatures, the effect should be small. However, we are speculating that the reheating temperature itself is lower than e.g. the kaon freeze-out temperature in parts of our parameter space, in which case the kaons may never be populated with anything resembling a thermal distribution. It is thus possible that eq. (50) overestimates the dark matter relic abundance.

V. DISCUSSION
In the foregoing sections, we have introduced a model to account for the KOTO excess and explored the cosmological effects. We now discuss the implications of our results and future experimental prospects.
If the KOTO excess is interpreted at face value, it suggests an apparent violation of the GN bound. As has been discussed by several authors [19, 21, 23, 25, 30, 36-38, 40-42, 52], such a signal at KOTO can be mimicked by a decay of the form $K_L \to \pi^0 X$, where $X$ denotes one or more invisible species. In contrast to most studies, we focus on a new physics scenario where the decay $K_L \to \pi^0 + \mathrm{inv.}$ is realized through a sequence of two-body decays $K_L \to SP \to \pi^0 PP$, where $S$ and $P$ are light neutral scalar particles. Similar scenarios were also studied in [37], where the light particles interact with the SM through a vector or scalar portal. Here we instead analyze a setup where $S$ and $P$ are coupled to the SM through effective operators at a characteristic new physics scale of $\Lambda_{\rm NP} \sim 10^6$-$10^9$ GeV. We have stabilized $P$ with a $Z_2$ symmetry under which SM species are even and our new species are odd, and we have entertained the possibility of other interactions consistent with such $Z_2$ invariance, including an $SP^3$ term that could mediate the decay $S \to 3P$. Our effective theory is readily UV-completed by e.g. very heavy vector-like quarks or a TeV-scale inert Higgs doublet.

[Footnote fragment: "…by the number density of the parent mesons, and is fairly insensitive to other details of the phase space distribution."]

[Figure 9 caption: In the leftmost panel (BM1), all decay channels are open, and production is dominated by $\pi^0$ decays. In the middle panel (BM2), $\pi^0 \to PP$ is closed, but $S \to 3P$ is still open. In the rightmost panel (BM3), both $\pi^0 \to PP$ and $S \to 3P$ are closed, so $S$ decays only via $S \to \pi^0 P$. In the leftmost panel, since production is dominated by $\pi^0 \to PP$, the relic abundance is controlled exclusively by $\Lambda_{dd}$. In this case, the required reheating temperatures are observationally inviable throughout the parameter space. In the other two panels, production is dominated by $K^0$ and $\eta$ decays, and their relative importance depends on $\Lambda_{sd}$ and $\Lambda_{dd}$.]
Such UV completions can realize a minimal case in which only interactions between SM quarks and SP are present at low energies, as well as more generic cases that include interactions with S 2 and P 2 .
If the KOTO excess persists, the GN bound heavily constrains new physics interpretations. A model of the type we consider, with new light scalars, is one of the simplest and most elegant solutions. Since the scale $\Lambda_{\rm NP} \sim 10^6$-$10^9$ GeV indicated by the KOTO excess is so large, most other experiments are not substantially constraining (with the notable exception of beam dump experiments, to which we will return shortly). In particular, in our scenario, there is a large region of parameter space which can account for the KOTO excess while remaining unconstrained by other rare meson decays. However, it is important to consider astrophysical constraints. Supernova cooling limits can potentially rule out lower $P$ masses: as discussed in section III A, supernova temperatures are high enough, at tens of MeV, to probe the lightest $S$ and $P$ masses that we consider in fig. 2. These constraints are most significant for $\Lambda_{dd} \lesssim 10^6$ GeV, and it is important to note that establishing firm constraints from supernova cooling requires a much more detailed analysis beyond the scope of this work. However, the simple expectation is that $P$ masses of $O(10~\mathrm{MeV})$ and below are disfavored, making our scenario easier to test.
Since the KOTO excess motivates the introduction of new feebly-coupled particles, it is natural to speculate that these new species might contribute to cosmological dark matter, and indeed, we have shown that $S$ and $P$ can constitute all of DM even in the most minimal scenarios needed to explain the KOTO signal. Nevertheless, this comes at a cost: in the absence of additional interactions, there is no mechanism to reduce the DM abundance, and cosmological reheating must take place at very late times, at a temperature of order 10 MeV. Such a thermal history is necessary because the effective coupling lies in an intermediate regime: it is too small for freeze-out to deplete the DM abundance, but large enough that UV freeze-in generically overproduces DM. Thus, an additional feature is needed to prevent overproduction. The simplest mechanism to accomplish this, without any modification to the model, is to make a judicious choice of the reheating temperature. Since we are working with an effective theory, the DM relic density is inherently sensitive to the reheating temperature; indeed, if $T_{\rm rh} \gtrsim \Lambda_{\rm NP}$, we cannot consistently calculate the relic density, but only bound it from below. Thus, since $T_{\rm rh}$ is necessarily a parameter of our model, $T_{\rm rh} \sim 10$ MeV is as natural as any other choice. As we have discussed, observational constraints are ineffective at temperatures above $\sim5$ MeV.
We note that in principle low-temperature reheating might leave an imprint on early universe probes such as BBN and the CMB. Previous studies (see e.g. [35] and references therein) relied on simple assumptions, such as a single massive matter species driving reheating and decaying primarily into neutrinos [20], electromagnetically-interacting species [33], or hadrons [34]. Generally, testable effects arise for $T_{\rm rh} \lesssim 5$ MeV, implying that no signal is expected for the scenario discussed here, where $T_{\rm rh} \sim 10$ MeV. However, it is important to point out that the reheating scenario might include features that could manifest themselves when more stringent probes of the CMB become available in the future [1]. For instance, the field driving reheating might actually be an ensemble of fields with different masses; the $S$ and $P$ particles might be directly produced in the decay of the field(s) driving reheating, changing the predictions for $T_{\rm rh}$ made above; or new physics in the neutrino sector could make reheating temperatures in the 10s of MeV visible once constraints on $N_{\rm eff}$ significantly improve.
There are other mechanisms which prevent the overproduction of dark matter without requiring a particular reheating temperature. One possibility is to add an interaction with the SM to restore freeze-out as a viable thermal history, as we discussed briefly in the context of a neutrino portal. This would be a heartening scenario: reheating can still take place at a very high temperature, and the coupling to leptons might allow for additional experimental probes. However, there are several other possibilities. In particular, it is possible that the DM abundance is depleted by additional interactions within the dark sector. This is not possible in our effective theory, but one can consider extensions which keep the DM in thermal equilibrium long after decoupling from the SM bath, or which allow other number-changing processes at a sufficient rate to allow for freeze-out at high temperatures. We emphasize again that our results imply cosmological constraints on models of the KOTO excess: cosmology requires either a restricted range of reheating temperatures or additional features of the low-energy theory, regardless of what fraction of cosmological DM is composed of $P$.
Of course, one can also consider constraints which only apply if $P$ makes up a significant fraction of DM. The simplest of these is the Lyman-$\alpha$ constraint on warm DM [51], which requires the $P$ population to be non-relativistic at temperatures of $O(\mathrm{keV})$. If $P$ is produced non-thermally via decays at $T \sim 10$ MeV, typical energies will be of order the masses of the parent states, i.e., $O(100~\mathrm{MeV})$. Thus, in order for $P$ to be non-relativistic when $T_\gamma \sim \mathrm{keV}$, we require that $m_P \gtrsim 10$ keV. This is a somewhat weaker bound than one expects from supernovae, but it is not subject to the complicated physics involved in such constraints.
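The estimate in this paragraph amounts to a two-line redshift calculation: momenta simply scale with the photon temperature, up to $O(1)$ entropy-release factors from changes in $g_{*S}$, which this sketch neglects:

```python
# Free-streaming estimate: a P produced with momentum ~100 MeV at
# T ~ 10 MeV redshifts like p ~ T (neglecting O(1) changes in g_*S).
# Requiring p(T = 1 keV) < m_P gives the warm-dark-matter bound.
T_PRODUCTION = 10.0       # MeV, production-era temperature
P_INJECTED = 100.0        # MeV, typical momentum from parent-meson decay
T_LYMAN_ALPHA = 1e-3      # MeV (= 1 keV)

p_at_keV = P_INJECTED * (T_LYMAN_ALPHA / T_PRODUCTION)  # MeV
print(p_at_keV)  # 0.01 MeV = 10 keV -> require m_P >~ 10 keV
```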
The annihilation cross section into visible states is much too small ($\sim10^{-50}~\mathrm{cm}^2$) for indirect detection to be viable, nor is there any significant self-interaction in the dark sector.
However, the scattering cross section with nuclei could be as large as $\sim0.1$ pb, and thus potentially within reach of planned future experimental sensitivity for sub-GeV dark matter direct detection. It is thus possible (albeit not guaranteed) that future experiments will probe such signatures associated with our model, particularly direct detection, but it is important to note that in the minimal scenario for the KOTO excess, these signatures are substantially suppressed even compared to the generic expectation. This is because the KOTO excess only requires SM interactions with the current $SP$, and not $PP$. Since any DM accounted for by our model is composed entirely of $P$, this means that any diagrams contributing to indirect detection must be suppressed by $\Lambda_{\rm NP}^{-4}$. Moreover, at lowest order, direct detection is only sensitive to the inelastic scattering process $NP \to NS$, which is kinematically forbidden for non-relativistic DM. It is thus challenging to conclusively establish that $P$ makes up cosmological DM through direct observational means.
However, it is potentially much easier to determine whether a model like ours accounts for the KOTO excess. If the excess persists at its present size, then as KOTO reaches its design sensitivity, hundreds of events will be observed. With a sample of this size, it is possible to distinguish our model from the SM three-body decay kinematically in much of our parameter space, simply by measuring the pion's transverse momentum. In fig. 10, we show the transverse momentum distributions expected at KOTO in the SM and in our model. By sampling from these distributions and applying the Kolmogorov-Smirnov test, we find that the $p_T$ distribution in our model can be distinguished from the SM three-body decay at $5\sigma$ with $O(100)$ events in much of our parameter space. Sensitivity is lost when $m_P$ is small and $m_S \sim m_{K_L}$, and the distributions may also be too close to distinguish at smaller $m_S$ if the $S$ lifetime is shorter than $O(10~\mathrm{cm})$. Still, there are good prospects for making such a determination within the next several years, as KOTO continues to collect data.
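A toy version of this statistical test, using stand-in distributions rather than our simulated spectra (a narrow two-body-like $p_T$ peak vs. a broad three-body-like spectrum, both hypothetical), shows how scipy's two-sample Kolmogorov-Smirnov test separates samples of $O(100)$ events:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Toy stand-ins for the pion p_T spectra (MeV); NOT our simulated
# distributions. A two-body cascade gives a relatively narrow p_T peak,
# while a three-body decay gives a broad spectrum.
n_events = 100
pt_model = rng.normal(loc=130.0, scale=15.0, size=n_events)   # peaked
pt_sm = rng.uniform(low=0.0, high=250.0, size=n_events)       # broad

stat, p_value = ks_2samp(pt_model, pt_sm)
print(stat, p_value)  # well-separated shapes: tiny p-value at n ~ 100
```

In the real analysis, the inputs would be the simulated $p_T$ spectra of fig. 10, and the sample size would be scanned to find where the discrimination reaches $5\sigma$.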
There are also discovery prospects for $S$ particles with meter- and centimeter-scale lifetimes at future beam-dump experiments. In particular, as discussed in section III B, the SeaQuest experiment can probe much shorter lifetimes than those to which CHARM and NuCal are sensitive. Backgrounds are relatively easy to control for experiments of this type, and they remain sensitive even in our minimal scenario. The figure of merit is the $S$ lifetime, which is at least $O(\mathrm{cm})$ in our minimal scenario. This can be reduced by enhancing the $SP^3$ interaction in our effective theory, but regardless, searches for long-lived particles promise to be a powerful probe of our scenario in the coming decade.

VI. CONCLUSIONS
Taken together, the anomalous KOTO events and the Grossman-Nir bound provide a strong hint of light new physics. In this work, we have introduced an effective theory that accounts for the excess in the $K_L \to \pi^0 + \mathrm{inv.}$ channel with a metastable scalar $S$; a lighter, stable pseudoscalar $P$; and effective dimension-5 operators that mediate interactions between $S$, $P$, and the $d$ and $s$ quarks. We provided two UV-complete models that would produce an effective theory consistent with our assumptions. We then investigated the implications of our effective theory for cosmology and vice versa. In particular, we have shown that cosmological overproduction of $P$ places important constraints on the structure of the low-energy theory.
At face value, in our minimal scenario, P cannot account for either dark matter or the KOTO excess unless the reheating temperature is close to 10 MeV. While it is possible to escape this conclusion by augmenting the model, e.g. with couplings of P to neutrinos, a low reheating temperature is unavoidable in the model's simplest incarnation. However, unless P is very light, the required reheating temperature is compatible with current constraints from BBN and CMB, possibly even offering an observational handle on the model once CMB Stage IV experiments further probe the effective number of relativistic species.
Finally, we have discussed three experimental tests of our scenario. First, we have shown that portions of our parameter space are within reach of future dark matter direct detection experiments. Second, our metastable $S$ may be discovered by upcoming long-lived particle searches, particularly the planned SeaQuest upgrade. Finally, if $P$ is in our favored mass range, future KOTO data alone can discriminate between our decay chain and the SM three-body decay on the basis of the neutral pion $p_T$ distribution. There are thus strong discovery prospects for $P$ dark matter within the next decade.

ACKNOWLEDGMENTS
The research of WA is supported by the National Science Foundation under Grant No. NSF 1912719. BVL and SP are partly supported by the U.S. Department of Energy grant number DE-SC0010107. We thank Maxim Pospelov for introducing us to this class of models for the KOTO excess, and for subsequent discussions. We are grateful to James Unwin for valuable conversations regarding the UV freeze-in paradigm. We thank Stefania Gori for pointing us to relevant beam dump constraints, and we thank Natalie Telis for valuable conversations concerning statistical methodology.

Appendix A: KOTO simulation
In this appendix, we provide details of our calculation of the quantity $R$ introduced in eq. (23). $R$ is the acceptance of the $K_L \to SP \to \pi^0 PP$ signal relative to the SM $K_L \to \pi^0\nu\bar\nu$ acceptance at KOTO. Our calculation is based on a Monte Carlo simulation following steps similar to the ones described in [37,40].
The layout of the KOTO beamline and the KOTO detector is described e.g. in [43]. We start by generating $K_L$ momenta, $p_{K_L}$, and $K_L$ decay vertex locations, $z_{K_L}$, based on a distribution that combines the exponential decay-length profile with the measured momentum spectrum, where $z_{\rm exit} = 20$ m is the distance of the beam exit from the target and $g(p_{K_L})$ is the measured $K_L$ momentum distribution at the beam exit from [43]. We include a small transverse component of the $K_L$ momentum such that the beam profile at the beam exit is constant within an 8.5 cm × 8.5 cm square and zero outside [43].
In the case of the SM decay, we generate pion momenta using the $K \to \pi$ form factor from [16]. In the case of the $K_L \to SP \to \pi^0 PP$ decay, we first generate momenta for $S$, based on the fixed energy of $S$ in the $K_L$ rest frame, $E_S = (m_{K_L}^2 + m_S^2 - m_P^2)/(2m_{K_L})$. We then decay $S$ with a decay length distribution that is determined by the $S \to \pi^0 P$ and $S \to 3P$ partial widths. The pion momentum is generated based on the known pion energy in the $S$ rest frame, $E_{\pi^0} = (m_S^2 + m_{\pi^0}^2 - m_P^2)/(2m_S)$. In both the SM case and the NP case, we let the pion decay promptly into two photons, each with energy $E_\gamma = m_{\pi^0}/2$ in the pion rest frame. We reject events with photons produced less than 2.5 m after the front face of the front barrel (which starts 1.507 m after the beam exit), as they would be rejected by the photon veto collar counters. All other photons are propagated to the calorimeter located 6.148 m after the front face of the front barrel [43].
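The two-body kinematics used in the simulation are fully fixed by the masses. The sketch below (with hypothetical benchmark masses $m_S = 300$ MeV and $m_P = 100$ MeV, chosen purely for illustration) implements the energy formulas above and checks energy-momentum conservation in the $K_L$ rest frame:

```python
import math

M_KL = 497.611   # K_L mass [MeV]
M_PI0 = 134.977  # pi0 mass [MeV]
m_S, m_P = 300.0, 100.0  # MeV; hypothetical benchmark values

def two_body_energy(M, m1, m2):
    """Energy of particle 1 in the rest frame of a parent of mass M
    decaying as M -> 1 + 2."""
    return (M**2 + m1**2 - m2**2) / (2 * M)

E_S = two_body_energy(M_KL, m_S, m_P)        # energy of S in K_L frame
E_P = two_body_energy(M_KL, m_P, m_S)        # energy of P in K_L frame
E_pi0 = two_body_energy(m_S, M_PI0, m_P)     # pion energy in S frame

p_S = math.sqrt(E_S**2 - m_S**2)             # momentum of S
p_P = math.sqrt(E_P**2 - m_P**2)             # back-to-back partner

print(E_S, E_pi0)
```

Energies sum to the parent mass and the daughter momenta balance, as they must for a two-body decay at rest.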
The energies and locations of the detected photons in the calorimeter are smeared using the parameters given in [48].
Based on the smeared energies and locations of the photons in the calorimeter, the transverse momentum and decay vertex location of the pion are inferred following the procedure described in [43]. If there is more than one solution for the vertex location in the decay volume, we pick the location further away from the calorimeter. We then perform the event selection as in [3], taking into account all cuts except the timing- and shape-related cuts and the trigger-related cut on the center of energy deposition. We use the updated signal region in the plane of the inferred pion transverse momentum and the pion decay vertex location from [49].
The results for $R$ in our benchmark scenarios are shown in fig. 3 as a function of the $S$ lifetime.