New perspective in searching for axion-like particles from flavor physics

We propose a new perspective on the search for axion-like particles (ALPs) based on quark and lepton flavor physics: combined measurements of the time-dependent CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ and the branching ratio of $B_s \to e^\pm \mu^\mp$, along with the anomalous magnetic moment of the muon. In the sub-GeV mass range accessible to flavorful ALP searches, the experimental sensitivity of these flavor observables is maximal around the pion mass scale (the "sweetest spots"), where a couple of loopholes (unexplored regions) in the ALP parameter space have so far remained because of unavoidable contamination from pion background events. The proposed complementary probes can precisely determine the ALP coupling to the photon at these sweetest spots/loopholes, and will significantly help future ALP searches cover the whole parameter space, including the present loopholes.

from neutral pions decaying to diphoton at beam dump experiments [14].
Limits on such 140 MeV ALPs can be placed by mono-photon and tri-photon searches at LEP, the Tevatron and the LHC, through $Z(\gamma^*) \to a + \gamma$ ($a$ = ALP) [15-19]. However, this channel is also contaminated by neutral pion events such as $Z \to \pi^0 + \gamma$, a rare decay that has not yet been observed even at the level of the SM prediction. A conservative bound has been derived by simply applying the upper limit on the SM branching ratio of $Z \to \pi^0 + \gamma$ to the 140 MeV ALP [18,19], which is, however, insufficient because of the pion contamination. Thus, direct ALP production of this kind in collider experiments does not supply "clean/disentangled" signals for the 140 MeV ALP.
Observations of supernovae [20] are free from the pion contamination and provide bounds on the ALP-photon coupling (see, e.g., [21,22]). Very recently, it has also been pointed out [23] that the effective number of relativistic degrees of freedom today and possible effects on Big Bang Nucleosynthesis can give a significant limit on ALPs, so that the coupling to the photon at around 140 MeV is severely bounded from below. Nevertheless, the loopholes remain open (see Fig. 1).
Seen from the other side of the same coin, the existence of the loopholes may provide us with "sweet spots" in the ALP search. As seen in Fig. 1, such an ALP is short-lived enough to decay to diphoton inside detectors, as is the case for the neutral pion. In this sense, the $B^{0,\pm} \to K^{0,\pm} + a$ search cannot constrain the ALP. Thanks to this fact, the ALP-fermion coupling can be enlarged to enhance other observables. Among meson decays, the constraint from $B^{0,\pm} \to K^{0,\pm} + a$ is in particular the most severe [8]; hence the ALP contributions to presently unexplored $b \to s$ transition amplitudes are maximized at around 140 MeV within the overall mass range $\lesssim$ GeV. Note also that when the mass moves away from the target scale (i.e., above the $B$ meson mass scale), the sensitivity to $(g-2)_\mu$ becomes lower and lower due to the decoupling effect [11]. Therefore, the "sweet spot" can indeed be the sweetest for a generic ALP, offering the highest discovery sensitivity and the best chance to hunt for the ALP responsible for $(g-2)_\mu$ as well. It is also expected to provide a nontrivial correlation between quark and lepton flavor physics in the presence of the ALP.
In this paper, we propose combined measurements of the time-dependent CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ and of the $B_s \to e^\pm \mu^\mp$ decay, along with $(g-2)_\mu$, as new experimental probes with the highest sensitivity to flavorful ALPs at around 140 MeV. Remarkably, a deviation of the time-dependent CP asymmetry from the SM prediction can be seen whether or not $(g-2)_\mu$ is consistent with the SM. The proposed complementary measurements can precisely determine the coupling of the 140 MeV ALP to the photon, will fully cover the present loopholes, and hence will significantly help explore the entire ALP parameter space.

II. ALP COUPLINGS
We begin by introducing a generic ALP model. The couplings of the ALP ($a$) to quarks and leptons, involving the decay constant $f$, arise generically in vector or axial-vector current forms because of its inherent Nambu-Goldstone boson nature. They can be written without loss of generality as in Eq. (1), where $d_i$ and $\ell_i$ are the down-type quark and charged lepton fields, with $i$ being the generation index ($i = 1, 2, 3$), the couplings $g^f_{V,A}$ are hermitian matrices, and we have disregarded couplings to up-type quarks and neutrinos, which are irrelevant to our current proposal. The ALP generically couples also to the photon as in Eq. (2), with $\alpha$ being the electromagnetic fine-structure constant and $C^{\rm eff}_{\gamma\gamma}$ the inclusive effective coupling, which includes both direct and indirect $a$-$\gamma$-$\gamma$ vertices, the latter of which can be induced by charged fermion loops.
We focus on the following ALP couplings: $(g^d_{V,A})_{23}$, $(g^d_A)_{33}$, $(g^\ell_{V,A})_{12}$, $(g^\ell_A)_{22}$, and $C^{\rm eff}_{\gamma\gamma}$. These are relevant to the existing constraints from flavor observables at an ALP mass around 140 MeV, such as $B_s$-$\bar{B}_s$ mixing ($C_{B_s}$, defined in Eq. (A4)), the CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ ($S_{CP}$, defined in Eq. (3)), $\Upsilon$ decays to $\gamma a$ and $\mu^+\mu^-$, leptonic $B_s$ meson decays to $\mu^+\mu^-$ and $e^\pm\mu^\mp$, $(g-2)_\mu$ ($\Delta a_\mu = a_\mu^{\rm exp} - a_\mu^{\rm SM}$), and $\mu \to e\gamma$.

FIG. 1. The sweetest spots in the ALP parameter space, which have the highest sensitivity to probe the flavorful ALP below $\sim$ GeV and are presently loopholes (blank domains). Current bounds are set on $C^{\rm eff}_{\gamma\gamma}$ normalized to the ALP decay constant $f$, defined in Eq. (2), for the ALP mass $m_a \sim 140$ MeV. The cyan shaded area is excluded by a conservative upper limit on $Z \to \pi^0 + \gamma$, currently placed most stringently by the LEP searches [15,17-19], without taking the $\pi^0$ contamination of the 140 MeV ALP into account. The orange and blue shaded domains are ruled out by SN1987A [24] and by the observed effective number of relativistic degrees of freedom today, $\Delta N_{\rm eff}$ [23], respectively. Here, the bound from $\Delta N_{\rm eff}$ has been set at 95% C.L. The limits from SN1987A and $\Delta N_{\rm eff}$ assume that the ALP couples dominantly to diphoton. Bounds from the beam dump experiments are currently inapplicable in the displayed mass range due to the serious contamination with neutral pion backgrounds.
Among them, as will turn out later, the CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ ($S_{CP}$) is the most important observable, because it can deviate from the SM prediction almost irrespective of the others. In this section we therefore discuss the physics of $S_{CP}$, paying particular attention to the forms of the ALP contributions arising at loop level from the Lagrangian (Eqs. (1) and (2)). Details of the other observables are given in Appendix A.
The time-dependent CP asymmetry in the $B^0 \to K_S^0 \pi^0 \gamma$ decay, $S_{CP}$, is evaluated as [25,26] $S_{CP} \equiv {\rm Im}\left[e^{-2i\beta_{\rm CKM}}\left(C_7^{*} C_7' + C_7 C_7'^{*}\right)\right] \ldots$ (Eq. (3)), where the Wilson coefficients $C_7^{(\prime)}$ are specified in Eq. (4). Here $C_7^{(\prime){\rm NP,arch}}$ correspond to contributions from the ALP exchange (arch) graphs at the one-loop level, in which $V_{tb}$ and $V_{ts}$ are the Cabibbo-Kobayashi-Maskawa (CKM) matrix elements and $G_F$ is the Fermi constant. The loop integrals $I^{\pm\pm}_{k,1}$ can be found in the Appendix of Ref. [28], and $k$ denotes the quark flavor flowing in the loop, where $k = 1$, 2 and 3 correspond to the contributions of the down, strange and bottom quarks, respectively. Since $g^{ij}_{s(p)}$ is proportional to the difference between (sum of) the masses of the $i$-th and $j$-th generation down-type quarks, the dominant contribution comes from the bottom quark loop, i.e., $k = 3$. Noting also that $g^{ii}_s = 0$, we find that the dominant part of $C_7^{(\prime){\rm NP,arch}}$ depends on three couplings: $(g^d_V)_{23}$, $(g^d_A)_{23}$ and $(g^d_A)_{33}$. It is then clear that $C_7^{(\prime){\rm NP,arch}} = 0$ when $(g^d_A)_{33} = 0$. The coupling $C^{\rm eff}_{\gamma\gamma}$ in Eq. (2) also contributes to $b \to s\gamma$ processes through Barr-Zee (BZ) type loops [29]. This contribution, denoted as $C_7^{(\prime){\rm NP,BZ}}$ in Eq. (4), can be evaluated in a way analogous to the BZ type correction to the $\mu \to e\gamma$ amplitude in Refs. [7,30]; here $x_b \equiv m_a^2/m_b^2$, and $g_\gamma(x)$ is the loop function given in the Appendix of Ref. [30].

IV. ALP COUPLING CORRELATIONS IN FLAVOR OBSERVABLES
As to the other flavor observables of present interest, the explicit formulae and the existing limits are presented in Appendix A. To summarize, all the relevant flavor observables are given as functions of the ALP couplings (up to the decay constant factor $1/f$). We first note that $(g^d_A)_{33}$ can be determined solely by $\mathrm{BR}(\Upsilon \to \gamma a)$. We also notice that once $(g^d_A)_{23}$ is fixed, we can determine $(g^d_V)_{23}$ as a function of $C_{B_s}$ and $\phi_{B_s}$. Another useful feature is that $|(g^d_A)_{23}|$ and $c_{e\mu}$ can be written as functions of $(g^\ell_A)_{22}$, $C^{\rm eff}_{\gamma\gamma}$ and the branching ratios of $B_s \to e^\pm\mu^\mp$ and/or $\mu \to e\gamma$.

V. COVERING THE 140 MEV LOOPHOLES
Although the ALP dominantly decays into $\gamma\gamma$, the diphoton signal cannot be distinguished from the neutral pion signal, so that loopholes exist around the pion mass of 140 MeV [14] (see also Fig. 1). However, such loopholes can be resolved and covered by measurements of the time-dependent CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ ($S_{CP}$) and of $B_s \to e^\pm\mu^\mp$, together with $(g-2)_\mu$ ($\Delta a_\mu$). This is our main proposal, which we demonstrate below.
As one reference point, we fix the flavor observable values as in Eq. (11), where the measured quantities have been set to their central values, and the upper limits have been taken from the future prospects as benchmark values, which are given in Appendix A. Note that according to Ref. [12], an upper bound on $c_{e\mu}$ can be read off from the muonium-antimuonium oscillation as in Eq. (12). This $c_{e\mu}$ is most sensitive to $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$, because the latter scales solely with $|c_{e\mu}|^2$ (see Eq. (10)). In order to be consistent with this bound and Eq. (11), we therefore set the benchmark in Eq. (13). This is actually taken to be the upper bound for the present analysis: we have checked that when $\mathrm{BR}(B_s \to e^\pm\mu^\mp) \gtrsim 10^{-12}$, there exists no solution that satisfies Eq. (12). Thus, we can evaluate the time-dependent CP asymmetry parameter $S_{CP}$ as a function of $C^{\rm eff}_{\gamma\gamma}$. We see the following features:
• As seen from Eq. (3), the new physics term in $S_{CP}$ consists of two contributions, proportional either to the CP phase in the SM or to a new physics phase, the latter arising here from the imaginary part of the ALP coupling $(g^d_R)_{23}$. Actually, a sizable deviation from the SM prediction ($C_7^{\rm NP} = O(0.001)$) can be generated even when $C_7^{\rm NP}$ has no imaginary part (i.e., solely with the KM phase in the SM). The deviation in our prediction is enhanced by the nonzero imaginary part of $C_7^{\rm NP}$ coming from ${\rm Im}[(g^d_R)_{23}]$.
• $S_{CP}$ can be large enough when $|C^{\rm eff}_{\gamma\gamma}| > O(100)/{\rm TeV}$. The stringent constraint from the muonium-antimuonium oscillation in Eq. (12) (or Eq. (13)) has a significant impact on $S_{CP}$ as a function of $C^{\rm eff}_{\gamma\gamma}/f$. In fact, when $\Delta a_\mu$ is fixed to the current size of the deviation (the benchmark value in Eq. (11)), we find that the predictions are almost fixed to the values in Eq. (14), which are to be compared with the SM prediction, $S^{\rm SM}_{CP} \simeq -0.0269$ [31].
• We also found that when $\Delta a_\mu$ deviates from the benchmark value in Eq. (11), the predicted value of $C^{\rm eff}_{\gamma\gamma}$ grows monotonically with increasing $\Delta a_\mu$, and vice versa. In this sense, the loopholes in Fig. 1 can be explored in light of the fate of $\Delta a_\mu$ in the future. In particular, a persistent nonzero $\Delta a_\mu$ will make $S_{CP}$ cover the blank range bounded by the upper edge of the SN1987A constraint and the lower edge of the LEP limit.
FIG. 3. [...] $B_s \to e^\pm\mu^\mp$ from the SM prediction, for the estimated $\Delta a_\mu = 2.61 \times 10^{-9}$ (red shaded domains). The solid (dashed) red lines correspond to $C^{\rm eff}_{\gamma\gamma} \geq 0$ ($< 0$). The red shaded bands have been created by choosing $C^{\rm eff}_{\gamma\gamma}$ in the allowed region in Eq. (A20).

Another interesting reference point is the case where $\Delta a_\mu = 0$ in the future. The result for $S_{CP}$ is shown in Fig. 2. In this case, $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$ can be larger than $10^{-12}$ without conflicting with Eq. (12), and hence we set as a benchmark the value corresponding to the prospective upper bound reported by LHCb [32]. The other input values are set to the same ones as in Eq. (11), except for $\Delta a_\mu$. The following features are then found:
• The deviation becomes larger than in the case with the current central value of $\Delta a_\mu$. This is because $|(g^d_{V,A})_{23}|$ is allowed to be larger in this case. First of all, a small $|C^{\rm eff}_{\gamma\gamma}|$ is favored to realize the benchmark values in Eq. (11) with $\Delta a_\mu = 0$. Hence $(g^\ell_A)_{22}$ is also required to be smaller for $\Delta a_\mu = 0$. This implies that the dominant contributions to $S_{CP}$ from the ALP exchange loops (with $|(g^d_{V,A})_{23}|$) and the BZ type loops (with the $C^{\rm eff}_{\gamma\gamma}$ coupling) are interchanged. A large $(g^d_A)_{23}$ is required to achieve the benchmark value of $B_s \to \mu^-\mu^+$ in Eq. (11). For such a large $(g^d_A)_{23}$, the values of $C_{B_s}$ and $\phi_{B_s}$ are realized by a cancellation between $(g^d_V)_{23}$ and $(g^d_A)_{23}$, which results in a large $(g^d_V)_{23}$ as well. Thus both $|(g^d_{V,A})_{23}|$ with $\Delta a_\mu = 0$ become larger by about two orders of magnitude than those with $\Delta a_\mu = 2.61 \times 10^{-9}$. Therefore a larger $S_{CP}$ can potentially be predicted with a smaller $C^{\rm eff}_{\gamma\gamma}$, which, with opposite sign, contributes destructively against the dominant ALP exchange loops with $|(g^d_{V,A})_{23}|$.
• For such a small $C^{\rm eff}_{\gamma\gamma}$, $c_{e\mu}$ also has to be small so as to interfere destructively and realize $\Delta a_\mu = 0$. Indeed we have $c_{e\mu} = O(10^{-2})$, which satisfies the muonium-antimuonium oscillation bound in Eq. (12).
Varying the reference value for $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$, we may also evaluate the correlation with $S_{CP}$. See Fig. 3 for $\Delta a_\mu = 2.61 \times 10^{-9}$ and Fig. 4 for $\Delta a_\mu = 0$. In these figures we have kept the benchmark values except for $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$. The red shaded bands have been created by choosing $C^{\rm eff}_{\gamma\gamma}$ in the allowed region in Eq. (A20). For the $\Delta a_\mu \neq 0$ case, $\delta S_{CP}$ is constrained to two narrow bands around $\delta S_{CP} \sim 0.17$ and $\delta S_{CP} \sim -0.15$, which are almost independent of the value of $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$. This can be understood as follows: because of the tiny value of $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$, $(g^d_A)_{23}$ is smaller than $(g^d_V)_{23}$. In this case, both $C_7^{(\prime){\rm NP}}$ depend strongly on the size of $(g^d_V)_{23}$. Its size is almost determined by $C_{B_s}$, and hence is independent of the $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$ value. For the $\Delta a_\mu = 0$ case, on the other hand, $S_{CP}$ has a strong sensitivity to $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$, because $(g^d_A)_{23}$ contributes to $S_{CP}$ at the same order as $(g^d_V)_{23}$ does, as explained above.

FIG. 4. [...] Fig. 3 for $\Delta a_\mu = 0$. The brown shaded region on the right side is excluded at 90% C.L. [33], and the brown dashed line shows the future prospect for $B_s \to e^\pm\mu^\mp$ at LHCb [32].

We can also change the benchmark value of $\mathrm{BR}(\mu \to e\gamma)$. The expected results can be easily deduced by following the discussion above:
• Note the scaling of $|(g^d_A)_{23}| \propto 1/\mathrm{BR}(\mu \to e\gamma)$. For $\Delta a_\mu \neq 0$, including the reference point $\Delta a_\mu = 2.61 \times 10^{-9}$, the size of $\delta S_{CP}$ is fairly independent of $\mathrm{BR}(\mu \to e\gamma)$, because $\delta S_{CP}$ is only weakly sensitive to $|(g^d_A)_{23}|$. Hence $\delta S_{CP}$ can be larger than $O(10)\%$, no matter how much smaller $\mathrm{BR}(\mu \to e\gamma)$ becomes.
In any case, $\delta S_{CP}$ is predicted to be large enough. Note that even the largest values displayed in Fig. 4, $\delta S_{CP} \sim \pm 4$, which correspond to $S_{CP} \sim 0.08$ and $S_{CP} \sim -0.13$, are consistent with the current experimental limits [34,35] within $2\sigma$.
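The correspondence between $\delta S_{CP}$ and $S_{CP}$ quoted above can be checked with a short numerical sketch. It assumes that $\delta S_{CP}$ is the relative deviation, $\delta S_{CP} = (S_{CP} - S^{\rm SM}_{CP})/S^{\rm SM}_{CP}$, a definition inferred here from the quoted numbers rather than taken from an explicit formula in the text; the function name `s_cp_from_delta` is purely illustrative. The SM value $-0.0269$ is the one quoted from Ref. [31].

```python
# SM value of S_CP quoted in the text (Ref. [31]).
S_CP_SM = -0.0269


def s_cp_from_delta(delta_s_cp: float, s_sm: float = S_CP_SM) -> float:
    """Convert a relative deviation into S_CP.

    ASSUMPTION (not stated explicitly in the text):
        delta_s_cp = (S_CP - S_SM) / S_SM
    so that S_CP = S_SM * (1 + delta_s_cp).
    """
    return s_sm * (1.0 + delta_s_cp)


if __name__ == "__main__":
    # Largest deviations displayed in Fig. 4 according to the text.
    for delta in (+4.0, -4.0):
        print(f"delta S_CP = {delta:+.0f}  ->  S_CP = {s_cp_from_delta(delta):+.2f}")
```

Under this assumed convention, $\delta S_{CP} = +4$ maps onto $S_{CP} \approx -0.13$ and $\delta S_{CP} = -4$ onto $S_{CP} \approx 0.08$, matching the two values quoted in the text.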
One may also notice that the results in Figs. 2 and 4 depend on the size of $(g^d_A)_{33}$, which has been obtained by taking the upper limit on $\Upsilon \to \gamma a$. It turns out, however, that even when we choose $(g^d_A)_{33} = 0$, the size of $\delta S_{CP}$ can be larger than $O(10)\%$. As to the case of $\Delta a_\mu \neq 0$, we have checked that the results in Fig. 3 are almost independent of $(g^d_A)_{33}$. Thus, whatever the future holds for the NP term in $(g-2)_\mu$, or in $\mathrm{BR}(\mu \to e\gamma)$ and/or $\mathrm{BR}(\Upsilon \to a\gamma)$, $\delta S_{CP}$ can be larger than $O(10)\%$, which would be testable in future experiments. Though other new physics candidates might also give a similar contribution to $S_{CP}$, the complementary flavor observations would distinguish the 140 MeV ALP from the others. Moreover, if both $S_{CP}$ and $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$ are observed, the size of the ALP coupling to the photon can be determined from Eq. (14) and Fig. 3 for $\Delta a_\mu \neq 0$, and from Figs. 2 and 4 for $\Delta a_\mu = 0$.
Regarding experimental prospects, even at Belle II with 50 ab$^{-1}$ of data [26], it will still be hard to observe the signal in $S_{CP}$, since the future uncertainty will only go down from the current 0.3 to 0.03, assuming the current central value $-0.29$. This is the value quoted from Ref. [26], which the current analysis follows. The HFLAV group reports $-0.15 \pm 0.20$ [34], and the Particle Data Group adopts $-0.78 \pm 0.59 \pm 0.09$ [35]. All of them are consistent with each other within the one-sigma uncertainty. The large uncertainty in the SM prediction essentially comes from uncontrollable long-distance QCD contributions. We nevertheless hope that future experiments have the potential to give us some hints in searching for 140 MeV ALPs. Indeed, an interesting possibility to decrease the uncertainty from long-distance contributions has recently been proposed, by combining observations of the time-dependent CP asymmetry for exclusive $B$ meson decays into a vector or axial-vector meson and a photon [36]. This method will be utilized to measure $S_{CP}$ at Belle II, and may help decrease the major theoretical uncertainty in the SM prediction and improve the future experimental sensitivity. If future experiments could shrink the uncertainty with a central value deviating significantly from the SM prediction, flavorful 140 MeV ALPs might favor a $(g-2)_\mu$ consistent with the SM one.

VI. CONCLUSION AND DISCUSSIONS
In conclusion, the time-dependent CP asymmetry in the $B^0 \to K_S^0 \pi^0 \gamma$ decay has the highest sensitivity to probe flavorful ALPs at around 140 MeV. The predicted deviation from the SM can be larger than $O(10)\%$ and turns out to be fairly independent of the presence of new physics in the anomalous magnetic moment of the muon, $\Upsilon$ decays, or $\mu \to e\gamma$; it will in the future fully cover the current sweetest spots/loopholes. Complementary measurements of the CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ and of $B_s \to e^\pm\mu^\mp$, in conjunction with the anomalous magnetic moment of the muon, can precisely determine the ALP coupling to the photon at around 140 MeV, and will significantly help to fully cover the ALP parameter space, including the regions which cannot be explored by existing and prospective direct search experiments.
Several comments are in order:
• As to the future prospects for the direct ALP production bound in beam dump experiments, it has recently been reported [37,38] that the ALP detection sensitivity can be increased by an improved analysis of the Primakoff scattering off nuclei, but the signal over background at around 140 MeV becomes weak and still suffers from the contamination with neutral pions, in contrast to an optimistic prospect in [13]. This contamination problem also affects the photoproduction of ALPs in the PrimEx and GlueX experiments [39].
• We have included the collider limit on the ALP from LEP in Fig. 1, but it actually still suffers from the contamination with the neutral pion background, in a sense similar to the beam dump experiments. In Ref. [19], Higgs-mediated processes such as $h \to Z + a$ and $h \to aa$, as well as $Z \to a + \gamma$, have been discussed and shown to have high sensitivity to the ALP signal over a wide range of the ALP parameter space, including the 140 MeV ALP. However, those should still suffer from the contamination with the neutral pion background at ALP masses around 135-140 MeV, which is not discussed in that reference. (Actually, the authors clearly state "We are not in a position to provide detailed estimates of detector and reconstruction efficiencies, or to perform solid background estimates." in the paper; see the fifth line from the bottom of page 32 of the published version.) Therefore, the projections in Figs. 17 and 23 of that reference cannot reliably be applied to the loopholes/sweetest spots for the flavorful 140 MeV ALP, and hence will not fully cover the ALP parameter space, just as in the case of beam dump experiments: no direct production of 140 MeV ALPs at collider experiments can be free from the pion contamination. Our proposal from the viewpoint of flavor physics is totally free from the pion background, and leads to the highest sensitivity as the "sweetest spot" in the full ALP parameter space. This is how in Fig. 1 the existing LEP (and also Tevatron and LHC) limits on the ALP around 140 MeV have been placed: by simply applying the upper limit on the SM $Z \to \pi^0 + \gamma$ event, without taking the pion contamination into account, rather than by isolated ALP signals.
• In Ref. [40] it has been argued that an enhanced monophoton plus large missing energy signature can cover one of the loopholes around 140 MeV, the one above the supernova constraint in Fig. 1. However, the other loophole, below the supernova lower limit, cannot be probed because the ALP cannot be made longer-lived in such a scenario, while our present proposal can probe it. Moreover, such a sterile-ALP coupling scenario is somewhat specialized and, since we keep generality, beyond our scope.
• Comparison with the long-lived particle (LLP) detection experiments (e.g. FASER and SHiP) is shown in Fig. 5.
In the figure, we display the projected sensitivity of the LLP searches in the ALP parameter space, quoting Ref. [41] for FASER and Ref. [42] for SHiP. Those LLP experiments become less sensitive when the $a$-$\gamma$-$\gamma$ coupling is so large that the lifetime (decay length in the experiments) becomes too short for the ALP to reach the detectors. Indeed, as seen from the figure, the ALP cannot be probed when the $a$-$\gamma$-$\gamma$ coupling is of $O(100/{\rm TeV})$. On the other side, the LLP experiments will also fail to detect the ALP when the lifetime becomes so long that the ALP does not decay inside the detector. The corresponding boundary will coincide with the upper limit placed by the supernova constraint. Thus, the domain enclosed by the lower limit from the supernova and the upper limit from $\Delta N_{\rm eff}$ will still remain an unexplored region, which corresponds to an $a$-$\gamma$-$\gamma$ coupling of $O(10^{-4}/{\rm TeV} - 10^{-3}/{\rm TeV})$. Other prospective LLP experiments (e.g. Belle II) have a similar detectability window (bounded from both above and below). Furthermore, to place the bound, those LLP experiments need to assume that the ALP coupling to diphoton dominates over the couplings to SM fermions, whereas the latter are actually important for the ALP significance in flavor physics, as addressed in this paper. (In particular, the Belle II paper [43] clearly states that they do not have enough sensitivity to resolve the ALP signal at around the neutral pion mass.) Our proposal in the present paper actually covers all the parameter space, by monitoring the ALP contribution to the muon $g-2$, including the two domains that the LLP experiments cannot explore: the top blank region in Fig. 5, with the LEP $Z \to \pi^0\gamma$ limit above it, is precisely the benchmark regime in which the anomaly in the muon $g-2$ can be explained by the 140 MeV ALP (see Eq. (14)), while in the bottom blank region one cannot see the significance of the ALP in the muon $g-2$ (Fig. 2). Both cases predict a sizable contribution to the time-dependent CP asymmetry in $B^0 \to K_S^0 \pi^0 \gamma$ and to the $B_s \to e^\pm\mu^\mp$ decay. Thus, our proposal is novel compared to the existing and prospective experiments, even including the LLP detection experiments.

FIG. 5. [...] Fig. 1. The characteristic accessibility is the blank domain below the supernova lower limit and above the FASER prospected upper limit; hence the presently proposed ALP probe will fully cover the 140 MeV ALP parameter space.
With this exclusion or discovery potential at hand, possible implications for the underlying theory of such flavorful axion-like particles (e.g., the origin of their mass and couplings) would be worth exploring, as well as the impact on the thermal history of the universe. The favored region of the parameter space in the figures implies $f \sim {\rm TeV} \times (C^{\rm eff}_{\gamma\gamma}/100)$. Supposing $C^{\rm eff}_{\gamma\gamma} < O(1)$, one gets $f \lesssim 10$ GeV, which could thus be much smaller than the typical QCD axion decay constant ($\sim 10^{9-10}$ GeV) or that of flavon-like axions [44,45]. When one naively embeds the present ALP model into a linear sigma model description such as flavon models, this small $f$ would result in further constraints from four-fermion operators among the SM quarks and leptons, induced by the exchange of the radial sigma mode. Alternatively, it would turn into a constraint on the presence of some new (hidden) QCD-like dynamics responsible for the 140 MeV ALP, with an intrinsic scale almost the same as that of QCD, $f \sim 100$ MeV, which implies a somewhat smaller $C^{\rm eff}_{\gamma\gamma} \sim 10^{-2}$. More conservatively, when the coupling is at most at the perturbative limit, i.e., $C^{\rm eff}_{\gamma\gamma} < O(4\pi)$, we would have $f \sim 100$ GeV. In any case, NP particles with masses on the order of $f$ would need to be well secluded from the SM particles to avoid the existing collider limits. This is highly model-dependent and beyond the current scope; this issue is worth pursuing elsewhere.
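The order-of-magnitude estimates for $f$ above follow directly from the scaling $f \sim {\rm TeV} \times (C^{\rm eff}_{\gamma\gamma}/100)$. A minimal sketch of this arithmetic is given below; the function names are hypothetical, and the ratio $C^{\rm eff}_{\gamma\gamma}/f = 100/{\rm TeV}$ is treated as an exact benchmark purely for illustration, whereas the text only states it as an order-of-magnitude relation.

```python
import math

TEV_IN_GEV = 1000.0


def decay_constant_gev(c_eff: float, coupling_per_tev: float = 100.0) -> float:
    """Return f in GeV for a given C_eff, assuming C_eff / f = coupling_per_tev / TeV."""
    return c_eff / coupling_per_tev * TEV_IN_GEV


def c_eff_for_decay_constant(f_gev: float, coupling_per_tev: float = 100.0) -> float:
    """Inverse relation: C_eff for a given f (in GeV) at the same fixed ratio."""
    return f_gev / TEV_IN_GEV * coupling_per_tev


if __name__ == "__main__":
    # C_eff ~ O(1): f of order 10 GeV.
    print("C_eff = 1     -> f =", decay_constant_gev(1.0), "GeV")
    # Perturbative limit C_eff ~ 4*pi: f of order 100 GeV.
    print("C_eff = 4*pi  -> f =", round(decay_constant_gev(4.0 * math.pi), 1), "GeV")
    # QCD-like scale f ~ 100 MeV: C_eff of order 1e-2.
    print("f = 0.1 GeV   -> C_eff =", c_eff_for_decay_constant(0.1))
```

With these inputs, $C^{\rm eff}_{\gamma\gamma} = 1$ gives $f = 10$ GeV, $C^{\rm eff}_{\gamma\gamma} = 4\pi$ gives $f \approx 126$ GeV, and $f = 100$ MeV corresponds to $C^{\rm eff}_{\gamma\gamma} = 10^{-2}$, in line with the estimates quoted in the text.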

Appendix A: ALP Flavor Observables
In this Appendix, we list the flavor observables and constraints relevant to the present ALP study (other than $S_{CP}$). The relevant couplings for the analysis are given in Eqs. (A1) and (A2).

Neutral B meson mixing
The ALP contribution to $B_s$-$\bar{B}_s$ mixing, including the mass range around 140 MeV, has recently been studied [8,10]. According to Ref. [10], the ALP contribution to $\Delta m_{B_s}$ is estimated by using the latest lattice results [46]; we refer readers to the literature for details. The resultant form is given in Eq. (A3). The measured value is $\Delta m_{B_s} = 1.1688(14) \times 10^{-8}$ MeV [47], with which the SM prediction is consistent [46]. To find the possible size of new physics (NP) contributions, we use results from the UTfit collaboration [48,49]. It is then convenient to define the $B_s$ mixing parameters $C_{B_s}$ and $\phi_{B_s}$ as in Eq. (A4). Note that $\Delta m_{B_s} = 2|\langle B_s^0|\mathcal{L}_{\rm eff}|\bar{B}_s^0\rangle|$, so that the SM prediction corresponds to $C_{B_s} = 1$ and $\phi_{B_s} = 0$. From the global fit to CKM observables, we find the best fit values $C_{B_s} = 1.110 \pm 0.090$ and $\phi_{B_s} = (0.60 \pm 0.88)^\circ$. In the main text, these observables have been used to determine the effective coupling combination $(g^d_V)_{23}/f$ in Eq. (A1).
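Since $\Delta m_{B_s}$ is quoted here in MeV, whereas oscillation frequencies are often given in ps$^{-1}$, the following sketch converts between the two in natural units using $\hbar = 6.582119569 \times 10^{-22}$ MeV s (CODATA 2018). The helper name `mev_to_inverse_ps` is an illustrative choice and is not taken from the references.

```python
HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV * s (CODATA 2018)
PS_IN_S = 1.0e-12             # one picosecond in seconds


def mev_to_inverse_ps(energy_mev: float) -> float:
    """Convert an energy in MeV into an angular frequency in ps^-1 (E = hbar * omega)."""
    return energy_mev * PS_IN_S / HBAR_MEV_S


if __name__ == "__main__":
    delta_m_bs_mev = 1.1688e-8   # central value quoted in the text [47]
    delta_m_bs_err = 0.0014e-8   # quoted uncertainty, from the "(14)" notation
    print(f"Delta m_Bs = {mev_to_inverse_ps(delta_m_bs_mev):.3f} "
          f"+/- {mev_to_inverse_ps(delta_m_bs_err):.3f} ps^-1")
```

The quoted central value corresponds to roughly $17.76$ ps$^{-1}$.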

Radiative bottomonium decay
The process $\Upsilon \to \gamma \slashed{E}_T$ was searched for by BaBar, and the current upper limit on the branching ratio is $4.5 \times 10^{-6}$ at 90% C.L. for the case where the invisible state is a light scalar with mass $m_a < 8$ GeV [50]. This process can be used to constrain the coupling combination $(g^d_A)_{33}/f$ arising from Eq. (A1). As in Ref. [51], the branching ratio normalized to $\mathrm{BR}(\Upsilon \to \mu\mu)$ can be estimated. Using the experimental value $\mathrm{BR}(\Upsilon \to \mu\mu) = 2.48 \times 10^{-2}$ [47] and assuming negligible ALP corrections, an upper bound on $(g^d_A)_{33}/f$ can be read off.

Leptonic B meson decays
The $B_s \to \ell_i \bar{\ell}_j$ decay width can be estimated from the couplings in the Lagrangian of Eq. (A1), where $m_{B_s}$ and $f_{B_s}$ are the mass and decay constant of the $B_s$ meson, $r_i \equiv m_i/m_{B_s}$ with $m_i$ being the mass of the $i$-th generation charged lepton, and $r_a \equiv m_a/m_{B_s}$. Here, $\lambda(x, y, z) = x^2 + y^2 + z^2 - 2xy - 2yz - 2zx$. Note that this decay width is symmetric under $r_i \leftrightarrow r_j$. For the $B_s \to \mu^-\mu^+$ decay, we should consider the interference between the SM and NP contributions. Therefore, we use the generic form of the branching ratio given in Ref. [52]. In the generic ALP model, NP contributions are induced in $C_P^{(\prime)}$, arising as the coefficient of the pseudoscalar current operator $m_b (\bar{s} P_{R(L)} b)(\bar{\ell}\gamma_5 \ell)$. Thus we focus only on the $C_P - C_P'$ term, with $V_{tb}$ and $V_{ts}$ being the CKM matrix elements and $G_F$ the Fermi constant. The current experimental result and the SM prediction for $B_s \to \mu^-\mu^+$ are given in Refs. [47,53], from which we note that the SM is consistent with the experimental result within $1.5\sigma$. For $B_s \to e^\pm\mu^\mp$, on the other hand, the SM branching ratio is negligible due to the absence of lepton flavor violation. The current experimental bound is [33] $\mathrm{BR}(B_s \to e^\pm\mu^\mp)_{\rm exp} < 5.4\,(6.3) \times 10^{-9}$ (90% (95%) C.L.). Note that $\mathrm{BR}(B_s \to e^\pm\mu^\mp)$ includes both $\mathrm{BR}(B_s \to e^-\mu^+)$ and $\mathrm{BR}(B_s \to e^+\mu^-)$ separately, and in the ALP case $\mathrm{BR}(B_s \to e^-\mu^+) = \mathrm{BR}(B_s \to e^+\mu^-)$, since hermiticity gives $|(g^\ell_{L,R})_{12}| = |(g^\ell_{L,R})_{21}|$.
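The kinematic function $\lambda(x, y, z)$ defined above is the standard Källén function. The sketch below implements it exactly as written and checks two of its properties: the symmetry under exchange of its last two arguments, which underlies the $r_i \leftrightarrow r_j$ symmetry noted in the text, and the factorization $\lambda(1, a^2, b^2) = [1 - (a+b)^2][1 - (a-b)^2]$, which vanishes at threshold $a + b = 1$. The arguments $(1, r_i^2, r_j^2)$ and the numerical values are assumptions made for illustration only; how $\lambda$ actually enters the decay width is fixed by the width formula of this Appendix.

```python
def kallen(x: float, y: float, z: float) -> float:
    """Kallen function: x^2 + y^2 + z^2 - 2xy - 2yz - 2zx."""
    return x * x + y * y + z * z - 2.0 * x * y - 2.0 * y * z - 2.0 * z * x


def kallen_factorized(a: float, b: float) -> float:
    """Equivalent closed form of kallen(1, a^2, b^2)."""
    return (1.0 - (a + b) ** 2) * (1.0 - (a - b) ** 2)


if __name__ == "__main__":
    # Illustrative (hypothetical) mass ratios, not values used in the paper.
    r_i, r_j = 0.02, 0.30
    lam_ij = kallen(1.0, r_i**2, r_j**2)
    lam_ji = kallen(1.0, r_j**2, r_i**2)
    print("lambda(1, r_i^2, r_j^2) =", lam_ij)
    print("symmetric under r_i <-> r_j:", abs(lam_ij - lam_ji) < 1e-12)
    print("matches factorized form:", abs(lam_ij - kallen_factorized(r_i, r_j)) < 1e-12)
    # At threshold (a + b = 1) the function vanishes.
    print("threshold value:", kallen(1.0, 0.25, 0.25))
```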

Muon anomalous magnetic moment
The discrepancy between the current experimental result and the SM prediction is [54-61] $\Delta a_\mu = a_\mu^{\rm exp} - a_\mu^{\rm SM} = 261(63)(48) \times 10^{-11}$, where the numbers in parentheses are the errors coming from $a_\mu^{\rm exp}$ and $a_\mu^{\rm SM}$, respectively. The current deviation is about $3.3\sigma$. Therefore, if the anomaly is real, the new physics contribution should be positive to explain it. However, it is well known that when the ALP has only flavor-diagonal couplings in the lepton sector, namely $(g^\ell_{L,R})_{ij} = 0$ ($i \neq j$), the one-loop contribution can never explain the deviation. Recently, it has been pointed out [11] that ALPs can explain the deviation when the contribution from nonzero flavor off-diagonal elements, $(g^\ell_{L,R})_{ij} \neq 0$, is taken into account.
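The quoted significance of about $3.3\sigma$ can be reproduced by combining the two errors in quadrature. The sketch below assumes that the experimental and theoretical uncertainties are uncorrelated and Gaussian, which is an assumption made here for illustration; the function names are likewise illustrative.

```python
import math


def combined_error(err_exp: float, err_sm: float) -> float:
    """Quadrature sum of two errors, ASSUMING they are uncorrelated."""
    return math.hypot(err_exp, err_sm)


def significance(central: float, err_exp: float, err_sm: float) -> float:
    """Deviation from zero in units of the combined standard deviation."""
    return central / combined_error(err_exp, err_sm)


if __name__ == "__main__":
    # Delta a_mu = 261(63)(48) x 10^-11, all numbers in units of 10^-11.
    delta_a_mu, err_exp, err_sm = 261.0, 63.0, 48.0
    print(f"combined error = {combined_error(err_exp, err_sm):.1f} x 10^-11")
    print(f"significance   = {significance(delta_a_mu, err_exp, err_sm):.1f} sigma")
```

The combined error is about $79 \times 10^{-11}$, giving a deviation of about $3.3\sigma$.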
As in Ref. [11], when we set $(g^\ell_A)_{22}/f = -10^{-4}/{\rm TeV}$ and $c_{e\mu}/f \sim 10/{\rm TeV}$ with $m_a \sim 0.12$-$0.15$ GeV, we find a parameter space that accounts for the deviation in $\Delta a_\mu$ without conflicting with several experimental bounds from lepton flavor violating processes. Furthermore, there also arises a contribution from the BZ type loop involving the $a$-$\gamma$-$\gamma$ coupling, $C^{\rm eff}_{\gamma\gamma}$ in Eq. (A2). Therefore, we consider all these contributions and look for the parameter space which explains the $(g-2)_\mu$ anomaly. Both loop functions are available in Refs. [11,19].

Limits on the ALP coupling to diphoton
In addition to the flavor limits, the $a$-$\gamma$-$\gamma$ coupling $C^{\rm eff}_{\gamma\gamma}$ for an ALP mass around 140 MeV is bounded as in Eq. (A20). Note that the upper bounds in the right-hand inequalities are the conservative limits from LEP searches [15,17-19] (see also the main text).