Quark Flavor Phenomenology of the QCD Axion

Axion models with generation-dependent Peccei-Quinn charges can lead to flavor-changing neutral currents, thus motivating QCD axion searches at precision flavor experiments. We rigorously derive limits on the most general effective flavor-violating couplings from current measurements and assess their discovery potential. For two-body decays we use available experimental data to derive limits on $q\to q' a$ decay rates for all flavor transitions. Axion contributions to neutral-meson mixing are calculated in a systematic way using chiral perturbation theory and operator product expansion. We also discuss in detail baryonic decays and three-body meson decays, which can lead to the best search strategies for some of the couplings. For instance, a strong limit on the $\Lambda\to n a$ transition can be derived from the supernova SN 1987A. In the near future, dedicated searches for $q\to q' a$ decays at ongoing experiments could potentially test Peccei-Quinn breaking scales up to $10^{12}$ GeV at NA62 or KOTO, and up to $10^{9}$ GeV at Belle II or BES III.

In this paper we reiterate the importance of flavor violating transitions for axion searches. Indeed, a QCD axion with flavor violating couplings could well be first discovered in flavor-physics experiments. While this possibility was already contemplated in the literature for rare meson decays and neutral meson mixing [31][32][33][34], we go well beyond the state of the art. We show that, in light of upcoming experiments, there are a number of additional dedicated searches that could and should be performed. Besides the two-body meson decays, also the three-body and baryonic decays can provide the best sensitivity to specific axion couplings. We provide a careful and rigorous analysis of the resulting constraints by exploiting the entire set of current experimental information. We also improve the predictions for axion induced neutral-meson oscillations by using effective-field theory methods, and derive a new bound from supernova cooling.
Predictions for axion couplings to fermions are model-dependent. The only requirement for a successful axion model is an almost exact global U (1) PQ symmetry that is spontaneously broken and anomalous with respect to QCD. Therefore, the only coupling shared by all axion models is the axion coupling to the CP-odd gluon operator, GG, arising from the QCD anomaly and responsible for the solution to the strong CP problem. Axion models divide into two classes, depending on how the color anomaly arises. In the KSVZ-type models [35,36] the color anomaly is due to a set of heavy fermions that are vectorlike under the SM but chiral under the PQ symmetry. In this case the axion does not couple to elementary SM fermions at tree level. In the DFSZ-type models [37,38], on the other hand, the SM fermions carry PQ charges, and the axion always couples to the fermionic currents. Whether these couplings are flavor conserving or flavor violating is a model-dependent choice.
In our analysis we remain, for the most part, agnostic about the origin of the flavor and chiral structure of axion couplings to SM fermions, and simply treat axion couplings to fermions as independent parameters in an effective Lagrangain. For related studies of axion-like particles with flavor violating couplings, see [68][69][70] (for loop induced transitions see [71][72][73][74][75][76][77][78]). We restrict the analysis to the case of the (practically) massless QCD axion, but our results can be repurposed for any other light scalar or pseudoscalar with flavor violating couplings to the SM fermions, as long as the mass of the (pseudo-)scalar is much smaller than the typical energy release in the flavor transition.
The paper is structured as follows. In Section II we introduce our notation for the axion couplings to fermions and comment on their flavor structure. In Section III we derive the bounds on these couplings from two-body and three-body meson decays, from baryon decays and from baryon transitions in supernovae. Section IV contains bounds from mixing of neutral mesons, Section V reviews bounds on flavor-diagonal couplings, and Section VI discusses axion couplings involving the top quark. Finally, in Section VII we present the results and experimental projections. Details about renormalization of effective axion couplings, experimental recasts of two-body meson decays and hadronic inputs are deferred to the Appendix.

II. AXION COUPLINGS TO FERMIONS
The Lagrangian describing the most general interactions of the axion with the SM fermions is given by 1 (see also Appendix A) where f a is the axion decay constant, c V,A fifj are hermitian matrices in flavor space, and the sum over repeated generational indices, i, j = 1, 2, 3, is implied. For future convenience we define effective decay constants as In general F V,A fifj , i = j, are complex, with F V,A fifj * = F V,A fj fi . Throughout the paper we take a to be the QCD axion, so that its mass is inversely proportional to f a [79], m a = 5.691 (51) µeV For the "invisible" axion the decay constant is f a 10 6 GeV [80], in which case the axion is much lighter than an eV and essentially decoupled from the SM. We will always be working in this limit, so that in the flavor transitions the axion can be taken as massless for all practical purposes. In this mass range the axion has a lifetime that is larger than the age of the universe, and therefore is a suitable DM candidate. If the PQ symmetry is broken before inflation, axions are produced near the QCD phase transition and yield the observed DM abundance for axion decay constants of the order f a ∼ (10 11 ÷ 10 13 ) GeV [13][14][15], assuming natural values of the misalignment angle. Other production mechanisms, e.g., via parametric resonance, allow for axion DM also for smaller decay constants, down to f a ∼ 10 8 GeV [81]. We will see below that precision flavor experiments are able to test this most interesting region of the QCD axion parameter space.
The axion couplings to the SM fermions in the mass basis, c V fifj and c A fifj , are related to the PQ charge matrices in the flavor basis, X f , through (4) where N is the QCD anomaly coefficient of the PQ symmetry. The unitary rotations V f L ,f R diagonalize the appropriate SM fermion yukawa matri- , for the "up" and "down" quark flavors, f = u, d. We focus on axion couplings to quarks, and refer the reader to Ref. [82] for present and future prospects for testing lepton flavor violating axion couplings. Off-diagonal couplings arise whenever PQ charges, X q L , X f R , are not diagonal in the same basis as the yukawa matrices, y f . Their sizes depend on the misalignment between the two bases, parametrized by the unitary rotations V f L , V f R (taking X q L , X f R to be diagonal).
Very different flavor textures of c V,A fifj are possible. Provided a suitable set of PQ charges and appropriate flavor structures of the SM yukawa matrices, it is possible for just a single off-diagonal coupling to be large c V,A fifj ∼ O(1), i = j, with all the other offdiagonal couplings zero or very small. For example, one can realize a situation where c V bs = c A bs ∼ 1 and all the other c V,A fi =fj = 0, by taking X q L = X u R = 1, 3 , with down yukawa matrix y d such that the only non-vanishing rotation is in the 2-3 RH sector, s Rd 23 ∼ 1. Moreover, while one would generically expect axial couplings c A fifj and vector couplings c V fifj to be of the same order, the latter can be suppressed in a situation where X f R = −X q L and V f R = V f L , which can arise in models where PQ charges are compatible with a grand unified structure (see Ref. [83] for a recent example in SO(10)), and yukawas are hermitian, positive definite matrices (see e.g. Ref. [84] for a realization of this scenario in SO (10)).
In the absence of a theory of flavor, we will be agnostic about the origin of the possible flavor misalignment, and simply take c V,A fifj to be unknown parameters in an effective Lagrangian, which will be constrained solely from data.

III. BOUNDS FROM HADRON DECAYS
Bounds on the vector and axial-vector parts of the flavor-violating axion couplings, Eqs. (1), (2), can be derived from searches for hadron decays with missing energy. In this section we consider the bounds from two-body decays of pseudoscalar mesons to pseudoscalar and vector mesons respectively, from three-body meson decays, and from decays of baryons. In each case, we first review the available and planned experimental measurements and then interpret them in terms of the bounds on flavor violating axion couplings.

A. Bounds from two-body meson decays
Due to parity conservation of strong interactions, the P 1 → P 2 a decays of a pseudoscalar meson P 1 to a pseudoscalar meson P 2 are only sensitive to the vector couplings of the axion, while the P 1 → V 2 a decays, where V 2 is the vector meson, are only sensitive to the axial-vector couplings (see Appendix C). Searches targeting specifically the massless axion were performed in K + → π + a [85], B + → K + a [86] and B + → π + a [86] decays. In addition, searches for SM decays where the invisible final state is a νν pair can be recast to derive limits on the axion couplings. This requires that the two-body kinematics of an (essentially) massless axion is included in the search region, as was the case in the BaBar and CLEO searches for B → K ( * ) νν [89], B → πνν [88] and D → (τ → πν)ν [87]. Note that the corresponding Belle data sets analyzed in Refs. [90,92] cannot be readily used to set bounds on axion couplings, because the analyses either cut out two-body decays with a massless axion [92] or used multi-variate methods [90] that are difficult to re-interpret for different purposes [93] (see also Ref. [94]).
The available experimental information is summarized in Table I, where we give the 90% CL upper limits on the branching ratios for decays involv-ing neutrinos or invisible massless axions. We include the limits on the decays involving axions that we obtained by recasting the experimental searches for decays involving neutrinos ("recast"). Table I shows bounds on the branching ratios for decays to pseudoscalar mesons, P 1 → P 2 a, for K + → π + a (s → d transition, experimental analysis in Ref. [85]), D + → π + a (c → u transition, our recast of Ref. [87]), B + → π + a (b → d transition, experimental analysis in Ref. [86] and our recast of Ref. [88]), B + → K + a (b → s transition, experimental analysis in Ref. [86] and our recast of Ref. [89]). For decays to vector mesons, P 1 → V 2 a, there is experimental information on the B → K * a decay (b → s transition, our recast of Ref. [89]). In the same section of Table I we also include the bounds on K + → π + π 0 a decay used in Sec. III B below (s → d transition, experimental analysis in Ref. [91]). For details on the recast see Appendix B.
The above analyses could be improved with dedicated axion searches applied to existing data or to ongoing experiments. Sensitivity to the K + → π + a decay better than the present world best limit can be achieved at the NA62 experiment. For a massless axion, an improvement by an order of magnitude compared to the BNL result, BR(K + → π + a) < 7.3 × 10 −11 [85], is expected by 2025 [95]. The same flavor transition is probed by KOTO searching for the neutral decay mode K L → π 0 a. The current limit using data collected in 2015 is BR(K L → π 0 a) < 2 × 10 −9 and is in the same ballpark as for the decay to the neutrino pair [96]. The KOTO collaboration expects the sensitivity to be improved down to the 10 −11 level [97]. Improved sensitivity in the neutral-kaon mode can also be expected from the proposed KLEVER experiment [98].
To the best of our knowledge, the future prospects for the heavy-meson axion decays P 1 → P 2 a or P 1 → V 2 a have not yet been estimated by the experimental collaborations. Nevertheless, it is clear that improvements over the present situation are justifiably expected. For instance, the experimental error on the D + → τ + ν branching ratio is projected to be reduced by a factor of 3 with 20 fb −1 integrated luminosity at BESIII [99] compared to the present value [100], indicating a potential significant improvement in sensitivity to the c → ua transition. The reach on the branching ratios for the axionic decay modes of B mesons will be improved with the amount of data expected to be collected at Belle II, which is roughly a factor of 50 larger compared to the BaBar and Belle samples.
Several potentially interesting channels are lacking any experimental analyses so far. For example, there is no experimental analysis of c → ua transitions that are sensitive to the axial-vector coupling, i.e., there are no D → ππX inv or D → ρX inv , X inv = νν, a, searches. One could also search for a c → ua signal in D s → Ka, D s → K * a decays, all of which could be performed at Belle II and BESIII.
Potentially, LHCb could also probe these couplings using decay chains, such as B − → D 0 π − followed by D 0 → ρ 0 a, which results in three charged pions + MET and two displaced vertices. The lack of such analyses means that there is at present no bound from meson decays on axial cu couplings to the axion. Similarly, there is at present no publicly available experimental analysis that bounds the B → ρa decays (as discussed above, one cannot readily use for that purpose the B → ρνν Belle data from Ref. [90], while BaBar has not performed such an analysis). Finally, our recast bounds on B → K ( * ) a, B → πa could be easily improved by dedicated experimental searches using already collected data. At LHCb one could measure the B → K * a and B → ρa branching ratios using the decay chains such as B 0 * * s 101]. One could also attempt more challenging decay chain measurements such as B * s → B s γ, followed by B s → φa or B s → K * a.
We now convert the bounds on the branching ratios in Table I to bounds on flavor violating couplings of axions to quarks, Eqs. (1), (2). The corresponding partial decay widths are given by with the kinematic prefactor where M 1 (M 2 ) is the mass of the parent (daughter) meson. Since K L → π 0 a decay is CP violating, the partial decay width in that case is given by and thus vanishes in the CP conserving limit, Im F V sd = 0, cf. Eq. (2). The K L → π 0 a and K + → π + a decay rates obey the Grossman-Nir bound BR(K L → π 0 a) ≤ 4.3 BR(K + → π + a) [102,103].
The form factors f + (q 2 ) and A 0 (q 2 ) are defined in Appendix C, where we also collect the numerical values used as inputs in the numerical analysis. The resulting bounds on axion couplings F V,A ij are shown in Tab. III. The implications of these results and future projections will be discussed in Sec. VII.

B. Bounds from three-body meson decays
The E787 experiment at Brookhaven performed a search for the three-body K + → π 0 π + a decay mediated by the s → da transition, and set the bound BR(K + → π 0 π + a) ≤ 3.8 × 10 −5 at 90% CL [91]. The related decay mode K L → π 0 π 0 a has also been searched for, resulting in the upper limit for light massive axions BR(K L → π 0 π 0 a) 0.7×10 −6 [104]. However, this analysis excluded the m a = 0 kinematic region and is thus not applicable to the case of the QCD axion [105]. Other decay modes such as K L → π + π − a or those involving the decays of K S have not been investigated experimentally.
Parity conservation implies that the K → ππa decays are sensitive only to the axial-vector couplings of the axion to quarks (see Appendix C). The form factors entering the predictions are related via isospin symmetry to the form factors measured in K + → π + π − e + ν [106][107][108][109], making precise predictions for K → ππa decay rates possible. The two final state pions can be only in the total isospin I = 0 or the I = 1 state, since the s → da Lagrangian is |∆I| = 1/2, while the initial kaon is part of an isodoublet. Bose symmetry demands the decay amplitude to be symmetric with respect to the exchange of the two pions. The I = 0 (I = 1) amplitude is even (odd) under this permutation. The form factors must therefore enter in combinations which are even (odd) with respect to the exchange of pion momenta, p π1 ↔ p π2 . The two pions in the decay K 0 → π 0 π 0 a (K + → π + π 0 a) are in a pure I = 0 (I = 1) state and one obtains, and where s = (p π1 + p π2 ) 2 and β 2 = 1 − 4m 2 π /s. The functions F s , F p and G p correspond to the moduli of the coefficients in the partial-wave expansion of the form factors (which are complex functions of the kinematic variables of the two-pion system) in the K + → π + π − e + ν decay channel [106,107,110,111] (see Appendix C). In the first equation we also used the fact that the final state is CP-odd and, therefore, the decay occurs through the predominantly CPodd component of K L .
One can also obtain expressions for other decay modes from eqs. (8) and (9). The amplitude of K L → π + π − a contains both I = 0 and I = 1 isospin components and the decay rate in the isospin limit is, whereΓ(K + → π + π 0 a) is obtained by replacing 1/|F A sd | → Re (1/F A sd ) in Γ(K + → π + π 0 a). Rates for the K S decays are obtained by replacing Re (1/F A sd ) → Im (1/F A sd ) in the rates of K L . Better sensitivity to the axial sd axion coupling can be achieved by searching for K L → π 0 π 0 a at KOTO [105], while a search for K + → π + π 0 a could be attempted at NA62. Note that eqs. (8) and (9) can be combined to the inequality BR(K L → π 0 π 0 a) ≤ 31 BR(K + → π + π 0 a), (11) which, besides the ratio of kaon lifetimes τ L /τ + = 4.1 (that gives the main effect in the Grossman-Nir bound for BR(K → πa)), also includes the effects of different combinations of Clebsch-Gordan coefficients, form factors and phase space factors.
The numerical input values for the hadronic matrix elements are described in Appendix C, while the resulting bound on the axion coupling obtained from the current bound on K + → π + π 0 a [91], is included in Tab. III. The implications of these results and prospects are discussed further in Sec. VII.
Other searches using three body decays could be of interest. Decays such as B + → ρ 0 π + a, B 0 → ρ 0 ρ 0 a or B s → K S ρ 0 a, B s → K * 0 ρ 0 a, B s → φρ 0 a could potentially be even attempted at LHCb, since they result in three or four charged particles and a massless invisible particle. Other possible decays of theoretical interest, but probably only measurable at Belle II, are B + → π 0 π + a, B 0 → π + π − a, B 0 → ρ + ρ − a, B + → ρ + ρ 0 a. Belle II could also access from the Υ(5S) run the B s decays with neutral pions such as B s → K S π 0 a, B s → K * 0 π 0 a, etc. Providing predictions for these decays lies beyond the scope of the present paper, though controlled calculations using QCD factorization and soft-collinear effective theory may be possible [112] and could be attempted in the future. Until then one can use as a rough guide the bounds that were obtained for the related two body decays, i.e., B + → π + a (on F V bd ) and B + → ρ + a (on F A bd ), as a ball-park figure for what an interesting experimental reach for B + → ρ 0 π + a (bounding a combination of F V bd and F A bd ) might be. Finally one can also use LHCb di-muon data collected for the B q → µµ analysis to constrain the axial couplings F A bd and F A bs . As long as no vetos on extra particles in the event are applied in the LHCb analysis, their datasets can be used to constrain decays with additional particles in the final state such as axions. In Ref. [70] the present data were used to derive constraints on F A bq of the order of 10 5 GeV, cf. Table III. With 300 fb −1 the bounds can be strengthened by about an order of magnitude, and could be further improved by using also the ATLAS and CMS data on B q → µµ. The same strategy might also be applied to sd transitions using K S → µµ decays, as proposed in Ref. [113].

C. Bounds from Baryon Decays
Baryon decays, B 1 → B 2 a, are sensitive to both axial and vector couplings of the axion. The decay rates are given by with the kinematic prefactor κ 12 given in Eq. (6).
The form factors f 1 (q 2 ) and g 1 (q 2 ) are discussed in detail in Appendix C. At present there are no published experimental searches for B 1 → B 2 a decays. We therefore set 90% CL upper limits on BR(B 1 → B 2 a) indirectly. For hyperons, we subtract from unity the branching ratios for all channels that have been measured so far, adding the experimental errors in quadrature [10]. For Λ b decays we use the SM prediction for its lifetime at NLO in α s and at O(1/m b ) in heavy quark expansion [114], compare it to the experimental measurements [10], and ascribe the difference to the allowed value for BR(Λ b → B 2 a). For Λ c we saturate the observed lifetime with the Λ + c → pa decay width. The resulting upper bounds on the branching ratios BR(B 1 → B 2 a) are collected in Tab. II and used to derive the limits on axion couplings F V,A ij shown in Tab. III.
The sensitivity could be improved substantially with dedicated searches for B 1 → B 2 a decays. The BESIII collaboration plans to measure B 1 → B 2 νν decays of hyperons by using a sample of hyperonantihyperon pairs collected at the e + e − → J/ψ peak [115]. At BESIII one could also study the decays of charmed baryons into axions. Namely, approximately ∼ 10 4 Λ + c Λ − c pairs have been collected in a run with 567 pb −1 of integrated luminosity at e + e − collisions just above the pair-production threshold [116]. Note that bottom baryons are not produced in the modern e + e − machines, since they run at energies below the corresponding pairproduction thresholds, so that only LHCb, while challenging, could have access to these decays.

D. Supernova bound
In the core of neutron stars (NS) hyperons coexist in equilibrium with neutrons, protons and electrons [117][118][119][120]. The decay Λ → na would represent a new cooling mechanism for NS, and can thus be constrained by stellar structure calculations and observations. At exactly zero temperature, the degenerate Λ and neutron distributions must have the same Fermi energy, leaving no phase space for the Λ → na decays to occur. The degeneracy is partially lifted at finite temperature allowing for the Λ → na transitions with a rate that increases with the temperature. The impact of this new cooling mechanism is maximal during the few seconds after the supernova explosion, when a proto-neutron star (PNS) reaches temperatures of several tens of MeV [121,122].
In order to estimate the cooling facilitated by the sd-axion interaction in this early phase of the supernova evolution we assume that the PNS is a system of non-interacting (finite temperature) Fermi gases of neutrons, protons, electrons and Λ baryons that are in thermal and chemical equilibrium. Furthermore, we assume that the neutrinos are trapped inside the PNS, while the lepton fraction number, relative to baryon number density, is taken to be Y L = 0.3 [123]. The occupancy of Λ states is distributed according to the Fermi distribution f Λ p p p = 1/ 1 + exp EΛ−µΛ T where p p p is the Λ threemomentum in the star's rest frame, E Λ its energy, E 2 Λ = p p p 2 + m 2 Λ , and µ Λ its chemical potential. Neutrons are distributed following an analogous distribution, f n p p p , characterized by µ n and labelled by the corresponding neutron three-momentum p p p , also in the star's frame. Anti-particles follow identical distributions with the replacement µ → −µ, so that for the temperatures expected in a PNS the densities of Λ and n are negligible.
The volume emission rate Q inside the PNS due to the process Λ → na is given by, where p max (p min ) is the maximal (minimal) neutron momentum in the Λ → na decay, if Λ has momentum p = |p p p| (all in the PNS's rest frame) 2 . Notice that in the non-relativistic limit where p, p m Λ ∼ m n , and in the limit of no Fermi blocking of the final state neutrons, this formula reduces to a more familiar form, where n n is the number density of neutrons. Evaluating the distributions (chemical potentials) for benchmark conditions of T = 30 MeV and nuclear density ρ = ρ nuc , using Eq. (13), we obtain for the energy loss per unit mass = Q/ρ, (15) Setting as the maximal limit on the energy lost through neutrino emission one second after the collapse of the supernova SN 1987A, 10 19 erg/s g [123,124], one obtains bounds on |F A sd | and |F V sd | in the range 10 9 -10 10 GeV. Our estimates are afflicted by significant uncertainties. Nuclear interactions induce important corrections in the calculation of the number densities [118,120] and there are considerable stellar uncertainties stemming from the complex physics at work in the supernova. Note that the energy loss per unit mass obtained using the approximate formula in Eq. (14) is independent of the structural details of the PNS, except for the temperature. At T = 30 MeV this leads to an emission rate that is ∼ 40% larger than in Eq. (15). More than anything, the emission rate suffers from the uncertainty in the temperature of the central region. Variation of this quantity from 20 to 40 MeV changes Q by two orders of magnitude. Finally, our bound crucially relies on the validity of the standard scenario for the SN explosion as applied to SN 1987A, which was disputed in a recent publication [125].

IV. BOUNDS FROM MESON MIXING
The exchanges of axions with flavor violating couplings contribute to ∆F = 2 transitions and can 2 In the star's rest frame Λ is moving in the direction ofp p p.
The maximum (minimum) three-momentum of the neutron in the PNS's rest frame is reached when the neutron recoils in the Λ's rest frame in the direction (in the direction opposite) top p p. modify meson mixing rates from the SM predictions. The contribution from axion exchanges to the mixing amplitude of the P 0 − P 0 neutral meson system is given by the time-ordered correlator where L af f (x) is the axion-fermion interaction Lagrangian in Eq. (1). In this section it will prove useful to use the following form of the axion-fermion interaction Lagrangian, which is obtained from Eq. (1) with the axiondependent field transformation Notice that O(a 2 ) terms that also appear in the transformation from Eq. (1) to Eq. (17) do not affect ∆F = 2 processes, and are omitted for that reason. The above form of L af f simplifies somewhat the calculations of the non-local meson mixing matrix elements, Eq. (16). For this we utilize the appropriate effective field theories; we use Chiral Perturbation Theory (ChPT) for contributions to K − K mixing, while for the heavy-quark systems, B d,s − B d,s and D − D, we use the Operator Product Expansion (OPE), matching onto local four-quark operators describing the meson mixing. In the following we use the relativistic normalization of states, P 0 (k)|P 0 (k ) = 2Eδ( k − k ), and the phase convention Since m a 1 GeV we can use ChPT to describe contributions from axion exchanges in K − K mixing. For axial couplings, c A fifj , the leading contributions arise at tree level, while for vector couplings, c V fifj , the first nonzero contributions are at one loop.
To construct the ChPT Lagrangian in the presence of flavor violating axions we use a spurion analysis [126,127]. In terms of the (pseudo)scalar interactions, the Lagrangian for QCD with a flavor violating axion can be conveniently written as where we keep only light quarks, q = (u, d, s). The diagonal mass matrix is M q = diag (m u , m d , m s ), while χ S,P are 3 × 3 Hermitian matrices describing the quark-axion couplings, The off-diagonal couplings in (20), (21) induce kaon oscillations. The mixing matrix element M 12 follows from a double insertion of the interaction La- The Lagrangian for QCD with the flavor violating axion, L QCD+a , is formally invariant under SU (3) R × SU (3) L transformations, q R,L → g R,L (x)q R,L , if aχ S,P and M q are promoted to spurions that transform as and that take the values We also define the spurion χ = χ S + iχ P that does not contain the axion field, and which transforms similarly as χ → g R χ g † L . The identification of this symmetry structure allows one to build the ChPT Lagrangian, including the chiral-symmetry breaking terms. The introduction of the spurion χ is needed because the axion cannot be treated merely as an external field and enters in the chiral loops with two insertions of χ S,P . Thus, the ChPT Lagrangian is also invariant under a Z 2 symmetry that transforms χ → −χ, with all the other fields taken to be Z 2 even.
Up to overall normalization factors, the axion induced contributions to the K − K mixing amplitude have the scalings 2m K M 12 ∼ (p/Λ χ ) ν (1/F ) ν F . The integer ν characterizes the usual chiral scaling [126,127] where the derivatives of meson fields (and the axion) count as O(p), the quark masses as O(p 2 ) and where the UV cut-off of ChPT is Λ χ 4πf π ∼ O(1 GeV) (f π is the pion decay constant). On the other hand, ν F counts the number of 1/F V,A ds insertions in the amplitude. Thus, the chiral counting of the spurions is M q ∼ O(p 2 ) and ν F = 0 and χ S,P ∼ O(p 2 ) and ν F = 1. In the following we use ChPT to calculate the leading axion-exchange contributions to the K − K mixing amplitude including corrections up to NLO corrections in the chiral counting, ν ≤ 4. This requires two insertions of 1/F V,A ds and thus ν F = 2. The LO ChPT Lagrangian, including a as the light degree of freedom, is given by while the relevant terms in the NLO ChPT Lagrangian are Here U (x) = exp(iλ a π a /f ) is the unitary matrix parametrizing the meson fields [126,127], B 0 is a constant related to the quark condensate, B 0 (µ = 2 GeV) = 2.666(57) GeV, f is related to the pion decay constant f f π / √ 2 = 92.2(1) MeV [128], with normalization 0|uγ µ d(0)|π − (p) = ip µ f π , and α 0,1 ∼ f 2 /Λ 2 χ are the (unknown) low energy constants.
Expanding (24) in meson fields gives Here we kept only the terms relevant for K − K mixing, and replaced √ 2f with the kaon decay constant, f K = 155.6 ± 0.4 MeV [10], thus capturing part of the SU(3) breaking that corrects our results at higher orders in the ChPT expansion [126]. We also traded the products B 0 m q for the meson masses squared [126,127,129]. The two terms in (25), expanded in meson fields, are where again we only keep the terms relevant for K − K mixing. The axion exchange contributions to the K − K mixing amplitude due to axial vector couplings are proportional to (1/F A ds ) 2 and are, up to O(p 4 ) in the chiral counting, given by The first term in the parenthesis is due to the tree level axion exchange, Fig. 1 a), and is induced by the first term in the expanded LO ChPT Lagrangian, Eq. (26). The NLO correction is due to the loop diagram in Fig. 1 b), induced by the last term in Eq. (26). We use dimensional regularization in the MS scheme and fix the renormalization scale to µ = m K in order to minimize the size of the chiral logarithm. The counter-terms in Fig. 1 d) which cancel the µ dependence of the loops are provided by the two terms in the expanded O(p 4 ) Lagrangian, Eq. (27). While the numerical values of the low energy constants α 0,1 are not known, we can estimate their sizes by varying µ in the loop contribution around its nominal value by a factor of 2, while artificially setting α 0,1 to zero. This estimates the O(p 4 ) contribution in (28) to be (17% ± 23%) of the O(p 2 ) one. The vector couplings of the axions first contribute at O(p 4 ) through the one loop diagram in Fig. 1 c) with the counterterms in Fig. 1 d), resulting in Above, we have re-arranged the factors of f K and m K to make the dependence on (1/F V ds ) 2 as well as the structure of the chiral corrections more transparent. The two-point loop function is The Γ V 12 receives a contribution from the discontinuity, Im I 0 (z π ), i.e., from the on-shell part of the diagram Fig. 1 c) with pion and axion running in the loop. The choice µ = m K again minimizes the log. However, since the low energy constants α 0,1 are unknown the predicted M V 12 is quite uncertain. In the numerical estimates we use α 0,1 = 0 and assign 100% uncertainty to the resulting estimate for M V 12 . As we will see in the next subsection, for heavy quarks both s-channel and t-channel exchanges of axions lead to contributions that are parametrically of similar size. However, this is not the case for light quarks. The above ChPT analysis implies that the tree level s-channel axion exchange (necessarily proportional to 1/(F A ds ) 2 ) is leading in the chiral expansion, while the t−channel contribution is subleading.
Note that in addition to the contributions to K − K mixing from axion exchange, there can be other contributions from UV physics which are parametrically of the same order, In the numerical analysis we set these UV model dependent contributions to zero, keeping in mind that their presence can modify our numerical results.
Numerically, Eqs. (28) and (29) give for the contribution of the axion to the neutral-kaon mass difference, where we use the relation ∆m K = 2ReM 12 and have set α 0 = α 1 = 0. The prefactor of the vector couplings carries an O(100%) relative uncertainty because of the unknown contributions of α 0,1 coefficients, entering at the same order in ChPT as the loop contribution. The prediction of ∆m K in the SM has large uncertainties stemming from longdistance contributions. Therefore, in order to obtain the bounds on the axion couplings from this observable we conservatively saturate the experimental value ∆m expt. K = 3.484(6)×10 −12 MeV [10] with the axion-exchange contribution. Assuming α 0,1 = 0 this leads to |F A ds | > 2.0·10 6 GeV and |F V ds | > 5.1·10 5 GeV at 90% C.L., for the case where 1/(F A,V ds ) 2 are real. These are the bounds quoted in Tab. III. Taking into account the estimate for the range of values for α 0,1 reduces the bound on |F A ds | by about 10%. Note that without fixing the values of α 0,1 there is no bound on |F V ds |. Allowing for large cancellations up to 1% between the loop diagram and the counterterm contributions relaxes the bound on |F V ds | by an order of magnitude.
To obtain the bounds on non-SM CP violating contributions to K − K mixing we use the normalized quantity For the theoretical prediction of K we use the expression [130] K = e iφ sin φ where ξ Im Γ 12 ∆Γ K .
We take the values for ∆m K = m L − m S , ∆Γ K = Γ S − Γ L , and φ = arctan(2∆m K /∆Γ K ) from experiment [10]. With the SM prediction for | K | from [131], and the axion contributions to M 12 , Γ 12 from Eqs. (28), (29) we get where in the numerical expressions we set the unknown low energy constants to zero, α 0,1 → 0, with the quoted errors our estimates of the resulting errors due to this approximation 3 . The global CKM fit by the UTFit collaboration obtains 0.87 < C K < 1.39 at 95% CL [132,133]. Assuming α 0,1 = 0 this translates into the 90% CL bounds |F A ds | > 4.4(7.7) · 10 7 GeV, |F V ds | > 0.9(1.5) · 10 6 GeV for purely imaginary and positive In Tab. III we quote the bounds from the less stringent case of CP violating contributions that positively interfere with the SM contributions to K . These bounds will improve in the future, once the improved prediction for K [131] is implemented in the global CKM fits.

B. Heavy-meson mixing
In the mixing of neutral heavy mesons, B s,d −B s,d or D − D, a large momentum, of the order of the heavy quark mass, p ∼ m Q Λ QCD , is injected through L af f , Eq. (16), to an intermediate set of states of the form a + hadrons. This allows one to perform an OPE in x ∼ 1/m Q and express the bi-local operator in terms of the local ones, The sum runs over different dimensions and possible structures of the local operators. The mass matrix element leading to the meson mixing is At LO in the 1/m Q expansion the local operators (35) are of dimension-six. One may have therefore naively expected the Wilson coefficients to scale as C i ∝ 1/m 2 Q . However, the axion couplings to quarks are ∝ m Q /f a , cf. Eq. (17), leading to C i ∝ 1/f 2 a . Note also that we are calculating the OPE at one single kinematic point over the physical cut of the a + hadrons intermediate state.
We assume the mass of the heavy-quark to be large enough so that the energies involved correspond to the perturbative regime of QCD and that the violations of quark-hadron duality are small.
We first present the results for the B 0 −B 0 system, and then extend the results to B 0 s − B 0 s and D 0 − D 0 systems. We work at LO in 1/m Q . The full basis of local operators at this order is given by [134], along with the operatorsÕ 1,2,3 obtained by replacing P L → P R in O 1,2,3 . The summation over color indices, α, β, is implied. The operator basis for B s − B s mixing is obtained from the above by replacing d → s, and for D − D mixing by replacing b → c, d → u.
The Wilson coefficients C i are most easily obtained by matching both sides of Eq. (35) for axionmediated bd → bd scattering with on-shell quarks in the initial and final state, see Fig. 2. The axion can be exchanged in s-and t-channels. In both cases the axion is far off-shell, Λ QCD , justifying the application of the OPE. Note that for the virtuality of the axion in the tchannel we are using the fact that in the heavyquark limit p B 0 = p b and p B 0 = p b .
The matching at O(α 0 S ) leads to the following nonzero Wilson coefficients, and similarly for the B s system, with d → s. The matching gets corrected at O(α s ) due to hard-gluon contributions to the matching, and at O(Λ QCD /m b ) from corrections to the heavy-quark limit. The matrix elements of the operators in Eq. (37) between B 0 and B 0 states are given by where f B is the B meson decay constant, and B i the appropriate bag parameter. Both of these are well known and have been been calculated using lattice QCD [135][136][137][138]. Parity conservation of strong interactions implies Õ i = O i so that the contributions to ∆m B from the ±1/(F A db F V db ) terms in (38) and (39) cancel. Using the lattice results from Ref. [138] we find where ∆m B (s) = 2|M (bs) 12 | and with the dominant theoretical uncertainty due to power-corrections entering at Λ QCD /m b ∼ 0.1. The reduced sensitivity to the vectorial couplings in this formula was anticipated already in the vacuum-insertion approximation estimate [32]. The SM predictions for ∆m B d,s are consistent, within ∼ 10%, with the measured values ∆m B d = 3.354(22) × 10 −10 MeV and ∆m Bs = 1.1688(14) × 10 −8 MeV [10]. Comparison with our predictions immediately shows that we can expect the bounds on F V,A sb,db at the level of 10 5 to 10 6 GeV.
To derive the precise bounds on the allowed axionic contributions we consider simultaneously both the contributions to ∆m B d,s as well as to the mixing phase. We thus define In the SM C Bq = 1 and φ Bq = 0. From the global fit to CKM observables, including B q − B q mixing observables, the UTFit collaboration ob- The SM predictions are obtained using the inputs in Table IV and ref. [138], and the results for the CKM matrix elements of the "New-Physics fit" of the UTfit collaboration (Summer 2018) [132,133]. This leads to the 90% C.L. bounds and |F A sb | > 4.0(7.7) · 10 5 GeV, when the weak phase of F A,V qb is aligned with (differs by ±π/2 from) the SM contribution. The first choice corresponds to MFV like couplings of the axion, where the CPV phase is the SM one, while the second choice corresponds to generic couplings with new weak phase that maximizes the axion exchange contribution to the meson mixing phase. In deriving the bounds above we chose in each case the sign of the NP contribution that leads to the weakest bound. For each of the bounds we also assume that the axion only has axial or vector couplings. These bounds are also collected in Tab. III.
Extending the above OPE results to the D 0 − D 0 system could be problematic since the violations of quark-hadron duality in the a + hadrons system at √ s m D 0 may be large. Extending naively our results we obtain for the axion contribution to the mass difference In the last line we used the lattice QCD results for the bag parameters from [139] (see also [140,141]).
To obtain a bound on |F A,V uc | we saturate the experimental value of ∆m D with the axion contribution because the SM prediction is poorly known.
A more stringent bound is obtained for CP violating axionic contributions from an experimental bound on the D − D mixing phase [142] where in the most commonly used phase convention The axion contributes at tree level to M 12 and only at loop level to Γ 12 . For present experimental bounds the SM contributions to φ 12 are negligible, i.e., φ 12 at present experimental levels would be induced entirely by the axion exchanges, and thus x Γ .
(50) Above we shortened M cu 12 → M 12 , while M a 12 is the mixing matrix element due to the axion exchange.
Comparison with the experiment, the HFLAV Moriond 2019 average φ 12 = −(0.25 ± 0.97) • [143], and PDG value for mass difference ∆m D = (95 ± 43) · 10 8 s −1 , gives at 90% C.L., At the end of LHCb Upgrade II the experimental constraints are expected to reach the parametric sizes of the SM contributions to the two mixing phases, φ M 2 ∼ φ Γ 2 ∼ 10 −3 . For the projections of future sensitivities we thus still use the projected 90% CL bound on |φ M 2 | < 2.0 · 10 −3 (approximate universality fit projection in [142]), assuming that the axion saturates the upper bound, i.e., we assume no cancellations with the poorly known SM contribution. This gives for the expected future sensitivities |F V cu | > 7.8 · 10 7 GeV, |F A cu | > 1.4 · 10 8 GeV, while the bounds from ∆m D do not change.
Finally, we reiterate that the above mixing bounds on F V,A qq assume that the axion contribution is not cancelled by the UV contributions from heavy particles present in the UV theory, even though the latter are expected to be parametrically of the same size.

V. BOUNDS ON FLAVOR-CONSERVING AXION COUPLINGS
For completeness we briefly review the constraints on flavor-conserving axion couplings that are dominated by astrophysical bounds from star cooling. These constrain axion couplings to photons, electrons and nucleons, defined by the Lagrangian For a wide range of axion masses, the strongest bound on axion coupling to photons F γ ≡ f a /C γ ≥ 1.8 × 10 7 GeV (95% CL) is set both by the CAST experiment [144] and the evolution of Horizontal Branch (HB) stars in globular clusters [145]. The CAST successor, IAXO, is expected to improve this bound by about an order of magnitude [146,147]. For restricted ranges of DM axion masses close to m a ∼ 3 µeV the ADMX experiment probes the axion-photon couplings up to F γ 10 12 GeV [148]. Note that the photon coupling C γ = E/N − 1.92 (4) can be suppressed only by tuning the ratio of EM and color anomaly coefficients, and thus is expected to be O(1) in the bulk of UV axion models. Axion couplings to electrons are constrained from star cooling, specifically from their impact on the luminosity function of White Dwarfs (WD) as assessed in Ref. [149], giving F e ≡ 2f a /C e ≥ 4.9 (4.6)× 10 9 GeV at 90 (95)% CL. Interestingly, there are hints of anomalous energy loss in stars that may be explained by axions with non-zero couplings to electrons (and possibly photons) of the order of F e ≈ 7 × 10 9 GeV [150,151].
Finally, axion couplings to nucleons are constrained by axion emission from the core of supernovae. Converting the results of Ref. [152] to our notation gives the bound F N ≡ 2f a /C N 1.0 × 10 9 GeV, where the effective coupling to nucleons, C N , is given in terms of axion couplings to protons and neutrons as C 2 N = C 2 n + 0.29 C 2 p + 0.27 C p C n . The nucleon couplings are related to the quark couplings (taking a UV scale of 10 12 GeV and including only QCD running effects) [44,153] C p + C n = 0.50 (5) where z = m u /m d = 0.48(3) and δ ≡ 0.038(5)c A ss + 0.012(5)c A cc + 0.009(2)c A bb + 0.0035(4)c A tt . Similar to axion-photon couplings, the couplings to protons and neutrons can be suppressed only by tuning, see Refs. [44][45][46]. However, as discussed in Section III D this bound relies on the validity of the standard scenario for the SN explosion.

VI. AXION COUPLINGS TO TOP QUARKS
Finally we discuss flavor-violating couplings of the axion that involve the top quark. Direct bounds on flavor-violating axion-top couplings are, in principle, accessible at the LHC. Monotop searches [154,155] are sensitive to the t → ca, ua FCNC transitions, and tt+MET [156] to the diagonal tta couplings. However, these searches bound f a only very weakly, at the level of the electroweak scale. Namely, simply restricting the contribution of t → ca to the total width of the top to be smaller than O(1 GeV), gives a bound f a O(v EW ) while the mono-top searches [155] may lead to a bound on f a that is roughly an order of magnitude stronger.
Much more stringent bounds on top couplings to axions can be obtained from virtual corrections. Because of the large top mass, the radiative yukawa corrections from axion-top couplings (Fig. 3) can give sizable contributions to other axion-fermion couplings that are strongly constrained. The leading log expressions for the radiative contributions are derived in Appendix A, with the y t -enhanced contributions collected in Eqs. (A10)-(A14).
The most relevant effects are the radiative corrections to the flavor-violating coupling c V sd , subject to stringent constraints from K → πa (cf. Section III A), and the flavor conserving coupling to electrons, c A ee , which is constrained by WD cooling (cf. Section V). The y 2 t enhanced contributions to these couplings are where µ is the low-energy scale, while on the righthand side the couplings are given at the UV scale f a . These contributions are added to the tree-level sd−a and ee−a couplings, so that from observations we can bound only the sum of the loop induced and tree level coupling, c ij (µ) = ∆c ij (µ) + c ij (f a ). Barring cancellations between the two contributions one can thus obtain bounds on the top quark couplings to axions. The UV values of sd − a and/or ee − a couplings can be suppressed in certain models. A suppressed sd − a coupling arises in scenarios where the down-quark sector is aligned, or all flavor violating axion couplings are strongly suppressed [72][73][74][75][76]. A scenario with suppressed ee couplings instead arises in, e.g., DFSZ-type models where all charged lepton couplings are suppressed if the ratio of the Higgs vevs is small.
Assuming that indeed ∆c ij (µ) c ij (f a ), the radiative contributions to FCNCs (∆c V sd ) and WD cooling (∆c A ee ) give stringent bounds on top-axion couplings.
The strongest constraint on the sd coupling, F V sd 6.8 × 10 11 GeV translates to a bound on the diagonal top-axion coupling F A tt 2.5 × 10 7 GeV, as well as the off-diagonal couplings F V,A tc 3.2 × 10 8 GeV and F V,A tu 7.0 × 10 8 GeV. Here we have assumed real couplings for simplicity (purely imaginary couplings result in a slightly different bound), and set in (56) y t = y SM t (µ = M Z ), f a = 10 10 GeV and used the values of the CKM elements of the "New-Physics fit" of the UTfit collaboration (Summer 2018) [132,133]. With the same numerical inputs one can derive a bound on the diagonal top couplings from WD cooling [32] using the bound F e ≥ 4.9 × 10 9 GeV at 90% CL. This translates to F A tt 3.4 × 10 9 GeV, which is about two orders of magnitudes stronger than the bound from K → πa. Note that similar radiative contributions are obtained for diagonal light quark couplings which can have an impact in the bounds derived from supernovae in specific models.

VII. RESULTS
In Table III  -K + → π + π 0 a -1.7 × 10 7 [104] (7 × 10 8 ) Λ → n a (decay) 6.9 × 10 6 5.0 × 10 6 [10] (1 × 10 9 ) (8 × 10 8 ) Λ → n a (SN) 7.4 × 10 9 † 5.4 × 10 9 † Σ + → pa 6.7 × using the flavor-changing processes discussed in the previous sections. Some of the bounds are afflicted by large theoretical uncertainties which could in principle change the quoted numerical values by as much as an order of magnitude. The affected bounds are: (i) the bounds on F A,V sd from supernova cooling due to Λ → na transition, where the temperature of the PNS and the interpretation of the SN 1987A neutrino events are two important sources of potential systematic errors; (ii) the meson-mixing bounds from ∆m D and from kaon mixing on F V ds suffer from poorly-known theoretical predictions; (iii) and bounds on top-axion couplings rely on additional, model-dependent assumptions on the absence of cancellations with tree-level contributions. In Tab. III all these bounds are flagged by a " †" superscript. Future projections for the bounds based on ongoing or future experiments are given in Tab III inside parentheses and will be discussed below. Furthermore, we recall that all meson-mixing bounds are sensitive to additional contributions from UV physics.
In Fig. 4 we summarize the most relevant bounds for the different flavor sectors and types of couplings, as well as the potential reach of ongoing and future experiments. These are compared to the strongest constraints on the diagonal axion couplings to electrons (WD cooling), nucleons (SN 1987A) and photons (HB/CAST), which were discussed in Section V. For the projected reach on the axion photon coupling "F γ (prosp.)" we quote the prospects for IAXO [146,147]. In Fig. 4 we do not show the bounds from ADMX, since these depend heavily on the assumed axion mass, but for particular axion mass ranges can be much stronger than the bounds from helioscopes.

A. Present bounds
The strongest bound on QCD axion couplings is of the order of 10 11−12 GeV and due to the stringent constraints on F V sd from K + → π + a decays, cf. Table III. This bound exceeds even the stellar axion bounds, F e,N 10 9 GeV, which rely on the diagonal couplings. If the flavor structure is not completely generic, the vectorial sd − a couplings could be suppressed and other probes can become equally or more important (so clearly K + → π + a should not be interpreted as the strongest constraint on the QCD axion independently of the underlying axion model). For instance, as discussed in Sec. II, the sd−a couplings could be strongly suppressed by suitable alignment of PQ-charge matrices and SM Yukawas.
The axial-vector sd − a couplings can be accessed from three-body kaon decay, K + → π + π 0 a, from hyperon decays, and indirectly from K − K mixing, resulting in lower bounds that are in the ∼ O(10 6−7 ) GeV range. In principle, the best bound on the axial sd − a coupling, F A sd O(10 9 ) GeV, comes from hyperon Λ → na transitions in the SN 1987A supernova. In fact, at face value this is the strongest bound of all the axial-vector axion-quark couplings in our analysis. However, as pointed out already above, the supernova bounds should be used with caution due to difficult to estimate systematics. As discussed below, quite impressively the projected improvements on the Λ → na decay branching ratio reach at BESIII will start to compete with this supernova bound, but using well-controlled transitions measured in the lab.
The two-body decays of B mesons probe all the Z A e p 4 z w N C D f P / w / S e G E K d L 7 g c S 3 i i S g g i 0 z w G b f v D m 3 E l + 7 7 U a L L C / + B y r x 7 t 1 w / P D 2 q N Z l H v P N p A W 2 g H R e g I N d A p O k M t R F G G X t A r e g v e g 8 / g K / g e P S 0 F h W Y d / Z l S 6 Q d K 7 L G E < / l a t e x i t > ⇤ ! na < l a t e x i t s h a 1 _ b a s e 6 4 = " n Z L Z n C j i i b r p I 3 p p f V S r q P h X 4 o 0 = " > A A A C S H i c Z V D L S g N B E J y N 7 / i K e v Q y G g R P Y d c H e h S 9 e P C g Y F R w Q + i d 7 e j g P J a Z W T U s + R K v + j P + g X / h T b w 5 S R a J s W G G o r q r 6 K 4 k E 9 y 6 M P w I K h O T U 9 M z s 3 P V + Y X F p e X a y u q V 1  b → s and b → d couplings up to a scale of ∼ 10 8 GeV, with the exception of F A bd due to the absence of dedicated B → ρa searches at the B factories (the related Belle B → ρνν analysis cannot be readily recast, as discussed in Sec. III). The bounds from twobody heavy-baryon Λ b decays and from B d,s − B ds mixing are more than two orders of magnitude less sensitive. Only the bound on the axial-vector bd − a coupling from Λ b decays and B − B, in the 10 6 − 10 7 GeV ballpark, are phenomenologically relevant constraints. Note that in this work we have focused on the case of the QCD axion, such that b → qa transitions result in the axion escaping the detector and a missing energy signature. A more inclusive search strategy, that does not require any information about the axion decay modes is possible using searches for B s → µµa decays [70], though with a reduced sensitivity to the quark-axion couplings compared to the other modes.

a T r Y I G e m A 5 y / k h A H k a t R j + s = " > A A A C R X i c Z V D L T g I x F O 3 4 R H y i S z d V Q u K K z P i I u E P d u D I Y R U 0 Y N J 3 O R R r 6 m L Q d l E z 4 D 7 f 6 M 3 6 D H + H O u N U B J g b x J m 1 O z u 0 5 O T 1 B x J m x r v v u T E 3 P z M 7 N 5 x b y i 0 v L K 6 t r h f V r o 2 J N o U 4 V V / o 2 I A Y 4 k 1 C 3 z H K 4 j T Q Q E X C 4 C T q n g / 1 N F 7 R h S l 7 Z X g R N Q R 4 k a z F K b E r d J b 4 W + P I c e 0 e V w + P + / V r R L b v D w f + B l 4 E i y q Z 2 X 3 C 2 / F D R W I C 0 l B N j G p 4 b 2 W Z C t G W U Q z / v x w Y i Q j v k A R o p l E S A a S b D 2 H 1 c S p k Q t 5 R O j 7 R 4 y O Z L 4 5 K w y y K T i Z 5 G q n H L h A h j e i J I r Q S x b T O 5 G 5 C / u 3 H j J F A 8 n E h n W 5 V m w m Q U W 5 B 0 F K 4 V c 2 w V H v S G Q 6 a B W t 5 L A a G a p f / D t E 0 0 o T Z t N + 9 L e K R K C C L D x I f I 9 I c 3 4 0 o O S v U m K / w P r n f L 3 l 7 5 4 G K / W D 3 J 6 s 2 h T b S N d p C H D l E V n a E a q i O K N H p G L + j V e X M + n E / n a /
The K and B meson decays also probe topquark axion couplings indirectly through loop effects, cf. Sec. II and Appendix A. These constraints become important in the scenarios with down-quark alignment or flavor-diagonal (or MFV) couplings in the UV. In particular, the strong bound on the K + → π + a branching fraction translates into bounds for the t → c and t → u transitions at the level of 10 8 −10 9 GeV. Although we show these constraints in Table III, we did not include them in Fig. 4 as they only constrain the left-handed combination and require the absence of possible cancellations with tree-level sd − a couplings.
Direct probes of flavor violating up-quark-axion couplings that are potentially sensitive to relatively high scales are possible with charmed mesons and baryons. The most sensitive probe of the flavor violating vectorial cu − a coupling turns out to be the two-body D + → π + a decay. Recast of the D + → (τ + → π + ν)ν analysis of CLEO and BE-SIII gives the bound F V cu O(10 8 ) GeV. Axialvector cu − a couplings are currently probed predominantly by D 0 -mixing with lower bounds in the range F A cu 10 6 − 10 8 GeV depending on the CPviolating phase of the axion contribution. These couplings can also be directly probed by Λ c → pa decays, which in the future could provide the best bounds on approximately real F A cu (this case is not included in Fig. 4 for simplicity).

B. Future projections
The sensitivity to the couplings of the flavoredaxion to the quarks can be greatly improved with future experiments. Dedicated searches for a massless axion in two-body kaon decays at NA62 and KOTO are expected to reach a sensitivity to the branching fraction better than 10 −11 [95] (cf. Sec. III). We thus use as the (conservative) experimental projection. As shown in Table III this will allow to push the lower bounds on vectorial sd − a couplings beyond a scale of 10 12 GeV.
The three-body kaon decays K L → π 0 π 0 a can be potentially searched for at KOTO [105]. However, to this date there is no analysis of the sensitivity KOTO could achieve for this decay channel. A direct extension of the current experimental sensitivity for BR(K L → π 0 π 0 a) 10 −6 [104] to the kinematics of a massless axion would give a bound |F A sd | ≥ 6.2 × 10 8 GeV. A very interesting set of probes for axial sd − a coupling is also offered by the hyperon decays. BESIII has a rich hyperonphysics program, and as part of it searches for axions should be attempted. We use the projections for the s → dνν decay modes, estimated for 5 fb 1 integrated luminosity in [115], as the projected bounds on the s → da decays: This will allow to reach scales as high as |F A sd | ∼ 10 9 GeV, entering in the range constrained indirectly by the supernova emissivity bound. Dedicated searches for such s → da hyperon transitions could lead to even more stringent constraints than the above conservative estimates.
All the limits on heavy-quark transitions from two-body meson decays reported in this paper are obtained recasting the analyses of "νν" decays at BaBar and CLEO. These bounds will certainly be improved by dedicated searches in the future. Belle II expects to gather 50 ab −1 of data in the next five years, roughly a factor 100 larger than the final integrated luminosity at BaBar. The gain in the bounds on the branching ratios depends on the scaling behaviors of the backgrounds. In the absence of dedicated experimental projections, we simply assume an "optimistic" scaling inversely proportional to the increase in luminosity with respect to the integrated luminosities on which the respective BaBar analyses were performed. The "conservative" scaling inversely proportional to the square-root of the number of total events would result in slightly weaker bounds. Assuming similar reconstruction efficiencies at Belle II as those achieved at BaBar one can expect an improvement in the sensitivity to F V,A bs and F V bd by at least an order of magnitude, see Table III for the optimistic projections. For the conservative scaling the expected bounds are about a factor 5 weaker.
In case of B → ρa the future projection can be estimated by using the current Belle bound for the "νν" mode in Table I and rescaling with the luminosities. This gives us, which would give the sensitivity to f a 10 9 GeV from the axial bd−a couplings, about three orders of magnitude stronger than the current bound on this coupling from B-B mixing. The expected bound in case of conservative scaling is about a factor of 3 smaller.
Finally, the searches for axions in charm-meson and charm-baryon decays could be undertaken at BESIII and Belle II. For mesons one can rescale the CLEO bound on D → πa by the expected gain in luminosity which is about a factor 20, assuming that BESIII will collect 20 fb −1 , which leads to bound on F V cu that is stronger by a factor 5 (2) for the "optimistic" and "conservative" scalings, respectively. For baryon decays, ∼ 10 4 Λ + c Λ − c pairs have been produced with 567 pb −1 at BESIII [116] while ∼ 10 7 Σ + Σ − pairs are expected with 5 fb −1 [115]. BESIII could therefore reach the limit obtained by naively rescaling with the relative sizes of the samples the projection for the branching fraction of the Σ + → pa decay shown in Eq. (59). This should lead to bounds on the axial cu − a coupling that are comparable with the current bounds from D − D mixing, but with the benefit of no UV model dependence.

VIII. SUMMARY AND CONCLUSIONS
In conclusion, in this paper we explored the phenomenological implications of the possibility that the QCD axion has flavor violating couplings to the SM quarks. Although the presence of flavor violation in axion models is very model-dependent, this scenario generically arises whenever the U (1) PQ charges are not family universal. The flavor violating couplings may even be a necessary ingredient in the UV structure of the theory. This is the case, for instance, if the U (1) PQ group that solves the strong CP problem is part of a larger flavor symmetry group, which is a frequent feature in models addressing the SM flavor problem with approximate horizontal symmetries.
In this paper we investigated in detail the flavor phenomenology of the axion couplings to quarks. We advocate to use a model-independent approach, treating every flavor-violating coupling, vectorial or axial-vectorial, as a different parameter. This is very similar, in spirit, to the global analyses to search for heavy new physics in low-energy experiments and the LHC using the SM effective field theory. Throughout this work we have assumed the limit of a practically massless axion. Much of the framework developed in this paper can be extended straightforwardly to massive axion-like particles with the appropriate changes due to the different kinematics involved.
The present paper goes beyond previous studies in the same spirit [32][33][34] in several ways: (i) We critically examined the bounds that can be derived using two-body decays of heavy-mesons with final state axions. By recasting BaBar data of B → K ( * ) νν and B → πνν one can derive limits that supersede the old CLEO direct searches of B → Ka and B → πa. We also find that the Belle analyses on the "νν" modes cannot be recast for this purpose. Thus, there is no bound from two-body decays on F A bd , with the best limit currently given by Λ b → na decays and B − B mixing.
(ii) We derived the strongest direct limit in the up-quark sector by recasting data on D + → (τ + → π + ν)ν as a search for the D + → π + a decay.
(iii) We provided the theoretical framework and phenomenological analysis of new processes not considered before, such as the three-body K → ππa decays and baryon decays. We argue that baryon decays can in the future give the best sensitivity to several couplings of the axion: the CP-conserving cu − a couplings from Λ c → pa decay, or the axial vector sd − a coupling from hyperon decays.
(iv) We derived the strongest limit on axial couplings using a novel interpretation of the supernova SN 1987A bound, based on the impact of the process Λ → na on the neutrino emissivity of the hot proto-neutron star.
(v) We developed a theoretical framework to extract reliable limits from neutral-meson mixing. This is based on the effective field theories of QCD; chiral-perturbation theory for the case of kaonmixing and operator product expansion for the case of heavy-meson mixing.
Our main results for present and future constraints on flavor-violating axion couplings are summarized in Table III and Fig. 4. For several of the considered modes a significant improvement is expected. Precision flavor facilities can potentially test PQ breaking scales up to the order of 10 12 GeV (NA62 and KOTO) and 10 9 GeV (Belle II and BES III), if dedicated searches are performed in ongoing experiments at strange, charm and bottom factories. This reach actually falls into the most interesting region of the parameter space where the axion can account for the observed dark matter abundance or possibly explain various mild hints for anomalous stellar cooling. These expectations strongly motivate a comprehensive experimental program of searching for axions in rare flavor-changing transitions, which may well lead to the first discovery of the QCD axion. Above the electroweak scale the Lagrangian describing the interactions of the axion with the SM Higgs and the fermions take the form where In the first two lines we included the relevant parts of the SM Lagrangian -the fermion kinetic terms, the Higgs kinetic term and the Yukawa interactions. For axion interactions we keep only the lowest dimension operators and allow for general flavor structure. The Yukawa matrices y f ij are general 3×3 matrices, while the axion interactions are described by hermitian 3 × 3 matrices c f ij . The Yukawa interactions are invariant under the axion-dependent, flavor-diagonal field redefinitions where α is a free parameter and Y ψ denotes hypercharge. Under these transformations the axion couplings shift as Thus one can always employ the above field redefinition with α = c H to get rid of the c H operator, and shift the axion couplings to fermions as in Eq. (A3), which gives Eq. (1). The axion couplings to the gauge fields also do not change, since the transformations in (A2) are non-anomalous -they correspond to a phase shift of each fermion field that is proportional to its hypercharge Y ψ . When performing the renormalization group (RG) evolution this field redefinition needs to be performed at each scale µ, in order to keep c H (µ) = 0. The RG equations for the couplings in matrix notation, keeping c H = 0, are given by 4 (see, e.g., Ref. [157]), e y e − c y e y † e −2 c H Tr 3y u y † u + 3y d y † d + y e y † e . (A9) Using these equations, one can express the lowenergy couplings in terms of high-energy couplings, diagonal SM Yukawas and the CKM matrix V . Keeping only the effects proportional to the top yukawa coupling, the radiative corrections to the axion couplings, ∆c V,A fifj , are given by The flavor-universal contributions arise from a nonzero c H which is radiatively generated and then through field redefinitions absorbed into c fifj at low energies. Note also that the high-energy couplings satisfy N bg = 13.5 ± 1.0, for a total number of tagged D decays N tot = 4.6 × 10 5 and single pion detection efficiency π = 0.89. This results in the 90%CL upper bound BR(D → πa) ≤ 8.0 · 10 −6 , following the same statistical prescription as before. A recast of the recent D + → τ + ν BESIII analysis [100] using pion like events from the τ + → π + ν channel, Fig. 3 right in Ref. [100], results in a bound that is about a factor 2 weaker than our recast of the bound from CLEO.

Appendix C: Hadronic matrix elements
In this appendix we give further details on the hadronic elements describing axion induced flavor changing transitions. The numerical values of different inputs entering the predictions are collected in Table IV.

Two-body decays
We first give the parametrizations of matrix elements for two-body hadron decays H → H a, where H ( ) is a pseudoscalar meson, a vector meson or a spin 1/2 baryon. The transitions are induced by quark level transitions of the type q → q a. The form factors in the resulting hadronic matrix elements are functions of the momentum exchange squared, q 2 = (p − p ) 2 = m 2 a 0, i.e., for the predictions we only need the values of form factors at q 2 = 0.
The hadronic matrix elements for P → P a transitions, with P ( ) pseudoscalar mesons, are parametrized by two sets of form factors P (p ) |q γ µ q|P (p) = P µ f P P + (q 2 ) + q µ f P P − (q 2 ), (C1) where P µ = (p + p ) µ . The related matrix element of the axial current q γ µ γ 5 q vanishes by parity invariance of the strong interactions. In the decay the Lorentz index in (C1) is contracted with −iq µ from the derivative of the axion field, cf. Eq. (1). The only hadronic inputs needed to describe the P → P a decays are therefore f P P + (0). For f K + π + + (0) we use the N f = 2 + 1 + 1 lattice QCD determination of f K 0 π − + (0) [160], since in the isospin symmetric limit f K + π + + (0) = f K 0 π − + (0). Likewise, we use for the axion induced charm meson decays the lattice QCD calculation of f D 0 π − + (0) reported by the ETM collaboration [163], along with the isospin relation f D + π + + (0) = f D 0 π + + (0). For the B + → K + a decay we use the lattice QCD determination of the form factors by the Fermilab/MILC collaboration [165], while for the B + → π + a decay we use the light-cone sum rule determination of the form factors in [168].
The hadronic matrix element for the decays of a pseudoscalar meson P into a vector meson V is given by V (p , η)|q γ µ γ 5 q|P (p) = i(η * · q) q µ q 2 2 m V A 0 (q 2 ) where η is the polarization vector of the vector meson. Parity conservation implies that the matrix element of the vector current, V (p , λ)|q γ µ q|P (p) , transforms as an axial vector and must be ∝ µνρσ p ν p ρ η σ . This gives a vanishing contribution to the decay amplitude upon contraction with the derivative interaction of the axion. Furthermore, contracting Eq. (C2) with −iq µ , and taking into account that one finds that A 0 (0) is the only hadronic input needed to describe the P → V a decays. For the B → K * and B → ρ form factors we use the lightcone sum rules determinations from Ref. [170]. The hadronic matrix elements for B → B a decay, with B ( ) the spin-1/2 baryons, are parametrized by B (p )|q γ µ q|B(p) = u (p ) f 1 (q 2 ) γ µ B (p )|q γ µ γ 5 q|B(p) = u (p ) g 1 (q 2 )γ µ + g 2 (q 2 ) M σ µν q ν + g 3 (q 2 ) M q µ γ 5 u(p).

K → ππa
The hadronic matrix elements needed for the decays K + → π + π 0 a and K L → π 0 π 0 a can be obtained from the form factors measured in the decay K + → π + π − e + ν using isospin symmetry. The matrix element of the axial-vector current for the latter process is defined as [110,111], where q = p − p + − p − is the four-momentum of the axion and thus q 2 0. Once the above hadronic matrix element is contracted with −iq µ from the derivative on the axion field, the contribution of R to the decay matrix element vanishes at the physical point q 2 = 0. The related matrix element of the vector current completely vanishes due to parity invariance once it is contracted with the axion-field derivative, similarly to the P → V a decays discussed above.
The K → ππ form factors depend on three kinematic variables which can be chosen to be q 2 , s = (p + + p − ) 2 and cos θ π , where θ π is the angle between the positively-charged pion's and kaon's three-momenta in the di-pion rest frame. The form factors are complex functions with a strong phase arising due to the rescatterings of the pions. We follow a standard parametrization of the form factors using a partial wave expansion of the two-pion system [110,111], and truncate it at the p-wave, F = F s + F p cos θ π exp(−iδ), (C7) G = G p exp(−iδ). (C8) Here δ = δ s − δ p is the difference of s-and p-wave π + π − phase shifts. The prefactors F s , F p and G p are only functions of q 2 and s. In the experimental analyses of K → ππeν decays they are often Taylor expanded around q 2 = 0 and s = 4m 2 π . For the K → ππa decays we only need their values at q 2 = 0, retaining the parameters of the expansion in the dimensionless variable s = (s/4m 2 π ) − 1, around s = 0, F s = f s + f s s + f s s 2 , F p = f p , G p = g p + g p s, which are determined from the experimental data [106,107] and shown in Table IV. In order to connect these form factors in the K + → π + π − e + ν channel to those in K + → π + π 0 a and K L → π 0 π 0 a, one uses the isospin-symmetry relations [108,109] with the convention that (u, d) and (−d, u) transform as isodoublets, π + π 0 |sγ µ γ 5 d|K + = − √ 2 (π + π − ) I=1 |sγ µ γ 5 u|K + , (C10) π 0 π 0 |sγ µ γ 5 d|K 0 = (π + π − ) I=0 |sγ µ γ 5 u|K + , where the subscripts on the right-hand sides indicate that we have projected the isospin wave functions of the two final pions to either I = 0 or I = 1. The total amplitude must be even under the exchange of the two pions (Bose symmetry). Therefore, for the isospin symmetric I = 0 (anti-symmetric I = 1) wave function only the s-wave (p-wave) components contribute. A final observation is that one does not have interference in the total rates between s-and p-wave components in these axion decay channels and they are insensitive to the strong phase δ.

Neutral meson mixing
Hadronic matrix elements of the four-quark operators Eq. (37) involved in the axion contributions to heavy neutral-meson mixing are conventionally defined in terms of the so-called bag parameters, B i . For the case of B − B mixing and shortening B 0 |O i |B 0 = O i these are defined as, with the values for η q i (m b ) and B (i) Bq (m b ) as provided in [138]. These definitions are straightforwardly extended to the short-distance contributions in the other neutral-meson systems. The values of the different parameters in these equations are obtained from lattice calculations. In case of the charmmeson oscillations we use results in ref. [139] which directly provides the results in terms of the matrix elements O i at µ = 3 GeV.