Belle II observation prospects for axion-like particle production from B meson annihilation decay

We investigate a new production mechanism of axion-like particles (ALPs) from B meson annihilation decays and its observation potential at the Belle and Belle II experiments. This mechanism allows for the production of ALPs from B meson decays in association with a large variety of mesons. In this article, we first estimate the branching ratios of such processes with a perturbative QCD method. Focussing on the most promising B → ha (h = K^±, π^±, D^0 and D_s) channels, we perform sensitivity studies for a decaying invisibly or into a diphoton final state at the Belle and Belle II experiments.


Introduction
The axion was introduced long ago by Peccei and Quinn as a solution to the strong CP problem [1], which asks why the CP violation allowed in the Standard Model (SM) is extremely suppressed in nature, as seen from the non-observation of the neutron electric dipole moment. When we consider this suppression as a result of an underlying global U(1) symmetry, the axion emerges as the Nambu-Goldstone boson of the spontaneous breaking of this U(1) symmetry. In this article, we investigate the so-called axion-like particle (ALP), a generic name for pseudoscalar particles at the GeV scale, much heavier than the axion in the original model [2]-[17]. The purpose of this article is to investigate a novel ALP production mechanism, B meson annihilation decay, and its search potential at the Belle (II) experiment.
In refs. [18]-[24], it has already been shown that the Belle (II) experiment can provide a unique probe for ALP searches. ALP searches are being intensively pursued, especially in the early physics program of Belle II, i.e. the specialised studies carried out before the SuperKEKB accelerator reaches luminosities high enough to perform its flavour physics program, e.g. with B and D mesons or τ leptons. For example, a specialised trigger was set up for the e^+e^− → 3γ process, in which an ALP could be produced through an aγγ coupling [25]. Taking advantage of the clean environment of the e^+e^− collider, Belle (II) can perform unique searches for ALPs which decay into γγ or nothing, i.e. invisible decay, even with existing data sets. Thus, we focus on these decay channels in this article.
In the following, we investigate the novel production mechanism of ALPs from the B meson annihilation process. The annihilation process occurs as the constituent quarks of B mesons (anti-bottom and up quark for the B^+ meson and anti-bottom and down quark for the B^0 meson) annihilate via the four-fermion interaction and produce hadrons in the final state. On the one hand, the observations of pure annihilation processes, B^0 → K^+K^− (5.2σ significance) [26,27] or B_s → π^+π^− (7.0σ significance) [28], prove the existence of such processes. On the other hand, annihilation processes have played important roles in B physics, such as the isospin violation of the B → ργ process [29]-[32] or the strong CP phase generated by the annihilation being an important source of the direct CP violation in charmless B decays [33,34]. In this paper, we investigate the phenomenology of the annihilation process in the ALP search at the Belle (II) experiment. In the mechanism we consider here, the ALP is emitted from one of the quark lines of the four-fermion interaction, as shown in Fig. 1. ALP production from B mesons has been discussed in many articles, but they mainly consider ALP production from the quarks in the loop (typically, the top quark) or from the W boson. Thus, the final states are limited to B → K^(*) a or B → π(ρ) a. The advantage of the annihilation mechanism is that the ALP can be produced in association with many other hadrons, such as D, D^*, D_s, ... as well as charmonium, J/ψ, η_c, ..., which opens more channels to explore at Belle (II). Furthermore, it can occur as a tree-level process, whose branching ratio is, in general, larger than that of the loop-level penguin process.
The annihilation diagram is calculable using either the so-called pQCD method [35] or the QCD factorisation method [36]. The computation is very close to that of the annihilation contributions to the radiative B → ργ decay [31,32]. In these approaches, the initial and final state meson distribution amplitudes are convoluted with a hard kernel, which, in this case, results in the a emission. The hard kernel includes the propagators of the quarks, which sit between the a emission and the four-fermion interaction. Although the annihilation diagram is a 1/m_b contribution and is generally suppressed, when there is a light quark propagator (i.e. ALP emission from a light quark), it is enhanced by a factor of 1/Λ_QCD. In addition, the ALP-quark coupling is proportional to the quark mass in models where the ALP-fermion coupling is induced by a derivative coupling, and this effect could compensate the 1/m_b suppression factor for the heavy quark cases. Thus, we will consider emission of ALPs from all possible quarks in B decay, i.e. u, d, s, c, b, and perform the sensitivity study for the best observation channels.
The article is organised as follows. In section 2, we describe our theoretical framework for ALP interactions, their couplings and the mass range we are interested in. In section 3, we review the computation of ALP production from the B meson annihilation diagram using the pQCD method. In section 4, we present our sensitivity study at the Belle (II) experiment and we conclude in section 5.

Annihilation diagram computation in pQCD

ALP production from annihilation diagrams
In this subsection, we show all the Feynman diagrams that can produce ALPs from annihilation B decays. We first categorise the final states by the corresponding CKM matrix elements as well as the colour-allowed tree, colour-suppressed tree or penguin topologies. For the purpose of this section, it is enough to consider only the dominant contributions. That is, B → Ka and B → πa have both penguin and tree contributions but they are, respectively, considered as the penguin and the tree processes. Similarly, the charmonium final state is considered as a tree process. The result, taking into account diagrams up to order λ^3, is listed in Table 1. Naively, we expect larger branching ratios for the tree diagrams. Among them, we investigate in the following those with Cabibbo-allowed colour-suppressed tree processes and Cabibbo-suppressed colour-allowed tree processes, i.e. the D^0, D^*0, D_s, D_s^*, π^+, ρ^+ final states. In addition, we also include Cabibbo-allowed penguin processes, i.e. K^+, K^*+, K^0, K^*0, as their experimental sensitivity is known to be quite high. The charmonium production is suppressed with respect to the two tree processes mentioned above but it can be interesting for future study. It should be mentioned that the pQCD method is not the most suitable for charmonium production and we would need to perform an independent theoretical investigation for these processes.
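The Cabibbo counting used above can be made explicit with the leading Wolfenstein powers of the CKM elements. The sketch below is only an illustration: the assignment of CKM factors to each final state (the `CHANNEL_CKM` dictionary) is our assumption based on the quark content described in the text, not a reproduction of the classification table of the original analysis.

```python
# Leading power of the Wolfenstein parameter lambda for each CKM element.
WOLFENSTEIN_ORDER = {
    "ud": 0, "us": 1, "ub": 3,
    "cd": 1, "cs": 0, "cb": 2,
    "td": 3, "ts": 2, "tb": 0,
}

# ASSUMPTION: dominant CKM factors per final state, chosen for illustration.
CHANNEL_CKM = {
    "D0 (colour-suppressed tree)": ("cb", "ud"),
    "Ds (colour-allowed tree)": ("ub", "cs"),
    "pi+ (colour-allowed tree)": ("ub", "ud"),
    "K+ (penguin)": ("tb", "ts"),
}


def lambda_order(elements):
    """Sum the leading lambda powers of a product of CKM elements."""
    return sum(WOLFENSTEIN_ORDER[element] for element in elements)


if __name__ == "__main__":
    for channel, elements in CHANNEL_CKM.items():
        print(f"{channel}: lambda^{lambda_order(elements)}")
```

With this assignment the D^0 and K^+ channels come out at order λ^2 and the D_s and π^+ channels at order λ^3, in line with the "Cabibbo-allowed" and "Cabibbo-suppressed" labels used in the text.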

Computation of ALP production from the annihilation diagram in the pQCD method
In this section, using the B^0 → D^0 a process as an example, we demonstrate how to compute ALP emission from the annihilation diagram in the pQCD method.
Let us start with the weak Hamiltonian:
where G_F is the Fermi constant and V_{qq'} is the CKM matrix element.
where (ψ̄_i ψ_j)_{V−A} = ψ̄_i γ_µ (1 − γ_5) ψ_j and (i, j) are the colour indices. For annihilation decay, the initial b quark and d quark must be in the same current. Thus, we use the Fierz transformed operators
Then, the amplitude is given as
where
Using the definitions of decay constants, distribution functions and form factors given in Appendix A, as well as the axion coupling defined as
we obtain the amplitudes for B^0 → D^0 a:
where the Ba form factors (f_{1,2}) and Da form factors (f_{3,4}) are computed as a convolution of the distribution functions of the B and D mesons and the hard kernel that represents the a emission.* The results yield
where x_i and b_i are the parameters describing the momentum of the light quarks inside the B and D mesons (i = 1 for B and i = 2 for D; for the notation see e.g. [32,33]). The integration of x and |b| must be taken in the range
where X represents the arguments of the Bessel functions, K_0, H_0, in Eqs. (8)-(11), i.e.
* The K_0 and H_0 terms represent the Bessel and the Hankel functions, which appear through the Fourier transformation of the k_⊥ of the spectator to the impact parameter b, which leads to

Table 1: The pQCD computation result of Br(B → ha)/g_q^2 for the B → ha processes (where ...

          up       down     strange  charm    bottom
D^0       0.270    3.700    0.000    0.062    0.004
D^*0      0.155    3.462    0.000    0.011    0.000
D_s       3.573    0.000    0.305    0.129    0.002
D_s^*     3.403    0.000    0.066    0.025    0.000
K^±       2.860    0.000    1.659    0.000    0.001
K^*±      0.765    0.000    1.370    0.000    0.000
K^0       0.000    2.220    1.510    0.000    0.0012
K^*0      0.000    1.090    1.590    0.000    0.000
π^±       2.080    0.320    0.000    0.000    0.000025
ρ^±       2.810    0.175    0.000    0.000    0.000062
As is well known, the a_2 value depends strongly on the renormalisation scale, t, e.g. from ∼ −0.32 to ∼ +0.06 for t = 1-5 GeV. In pQCD, t varies according to the momentum fraction carried by the light quark in the mesons.
Equations (8)-(11) show that this process is sensitive to the g_{u,d,c,b} couplings. As mentioned in the introduction, the dominant contribution comes from the a emission from the spectator, namely the f_2^{D^0} term. This is because the distribution function of the B meson peaks at smaller values of x_1 while the Bessel function K_0 is suppressed at higher values of x_1.
The amplitudes for the Cabibbo-allowed tree processes (D^{(*)−}, D_s^{(*)}) are computed in a similar way, with the Wilson coefficient a_1(t) instead of a_2(t), which varies more moderately from ∼ 1.2 to ∼ 1.0 for t = 1-5 GeV. These processes are sensitive to the g_{u,d,c,b} and g_{u,s,c,b} couplings, respectively. The (π^±/ρ^±, K^{(*)0}, K^{(*)−}) processes come from both penguin and tree diagrams and receive contributions from a_2(t), a_4(t), a_6(t). The computation for B → Ka is given in Appendix B as an example. They are sensitive to the g_{u,d,b}, g_{u,s,b} and g_{d,s,b} couplings, respectively.

Which final states are to be searched at Belle (II)?
Now let us see which processes are more sensitive to each g_i coupling. The obtained results are presented in Table 1.
Let us consider different g_q ≠ 0 scenarios and look into each case to decide which channels are most promising for the Belle II experiment.
g_u ≠ 0: In this case, the D_s, D_s^*, K^±, ρ^±, π^± channels obtain equally large contributions. The channel with a D_s^* is more challenging to reconstruct as it decays via D_s^* → D_s γ with a relatively low energy photon (over 90% branching ratio). The ρ^± decays into two pions, one of them neutral, and its experimental measurement suffers from much more background due to its broad width. Thus, the best channels would be B → (D_s, K^±, π^±) a.
g_d ≠ 0: In this case, D^0 and D^*0 obtain large contributions. However, the D^*0 decays to D^0 π^0 or D^0 γ, which makes this channel more challenging to observe than the D^0 one. So the best channel would be B → D^0 a for this case.
g_s ≠ 0: In this case, K^+, K^*+, K^0, K^*0 obtain the largest contributions. Experimentally, the best option would be the B → K^± a channel.
g_c ≠ 0: In this case, the best channel seems to be B → D_s a.
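The theoretical part of this channel selection can be reproduced directly from Table 1. The following sketch ranks the final states by their Table 1 coefficient for a given quark coupling; the dictionary and function names are ours, and the ranking deliberately ignores the experimental considerations (soft photons, broad resonances) discussed above.

```python
# Coefficients copied from Table 1; columns are (up, down, strange, charm, bottom).
QUARKS = ("u", "d", "s", "c", "b")
TABLE_1 = {
    "D0":    (0.270, 3.700, 0.000, 0.062, 0.004),
    "D*0":   (0.155, 3.462, 0.000, 0.011, 0.000),
    "Ds":    (3.573, 0.000, 0.305, 0.129, 0.002),
    "D*s":   (3.403, 0.000, 0.066, 0.025, 0.000),
    "K+-":   (2.860, 0.000, 1.659, 0.000, 0.001),
    "K*+-":  (0.765, 0.000, 1.370, 0.000, 0.000),
    "K0":    (0.000, 2.220, 1.510, 0.000, 0.0012),
    "K*0":   (0.000, 1.090, 1.590, 0.000, 0.000),
    "pi+-":  (2.080, 0.320, 0.000, 0.000, 0.000025),
    "rho+-": (2.810, 0.175, 0.000, 0.000, 0.000062),
}


def rank_channels(quark, top=3):
    """Return the `top` final states with the largest coefficient for `quark`."""
    column = QUARKS.index(quark)
    ranked = sorted(TABLE_1.items(), key=lambda item: item[1][column], reverse=True)
    return [(name, row[column]) for name, row in ranked[:top]]


if __name__ == "__main__":
    for quark in QUARKS:
        print(quark, rank_channels(quark))
```

The purely numerical ranking agrees with the choices made in the text for the down, strange and charm couplings; for the up coupling several channels are nearly degenerate, which is why reconstruction difficulty decides.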

Belle (II) sensitivity study
In this section, we perform a Monte Carlo (MC) study to estimate the sensitivity of the ALP search with the Belle and Belle II experiments at the KEKB/SuperKEKB e^+e^− colliders [37]-[40]. e^+e^− colliders have an advantage over hadron machines for missing energy and photon final states. Thus, we concentrate on the ALP decays a → invisible and a → γγ in this section. In the following, we perform a sensitivity study based on Belle MC, which can be extrapolated to Belle II.
The Belle detector [37] was located at the interaction point (IP) of the KEKB collider, which operated at a centre-of-mass energy of 10.58 GeV. From 1999 to 2010, the Belle experiment collected an integrated luminosity of about 711 fb^−1. The detector [37] consisted of concentric cylindrical subdetectors: a silicon vertex detector, a central drift chamber, particle ID detectors, an electromagnetic calorimeter (ECL), and a K_L^0-muon detector. In this work, the Belle detector geometry and response are simulated with Geant3 [41].
The decay channels B → ha, where h is π, K, D^0 or D_s, are studied. The MC samples of B^± → π^± a, K^± a, D_s^± a and B^0 → D^0 a are generated using EvtGen [42]. The a is forced to decay promptly to neutrinos (to mimic a generic invisible final state), or to a di-photon final state. The signal samples contain 1 million events for 33 different ALP mass hypotheses for the B → D, D_s processes and 45 mass hypotheses for the B → π, K processes, owing to the larger allowed mass range.
The background samples used in this work are MC samples produced by the Belle collaboration. They contain e^+e^− → B^+B^−, B^0 B̄^0 (both at 10 times the Belle integrated luminosity), and q q̄ (at 6 times the Belle integrated luminosity), where q = u, d, s, or c. There is also a sample of "rare" B decays produced with 25 times the Belle integrated luminosity. This sample contains B decays that, despite having small or poorly measured branching fractions, could be an important background to consider, e.g. B^+ → π^+ K^0.

Invisible decay of ALP
Events are reconstructed by first converting Belle MC to a format readable by the Belle II software (basf2) [43] using b2bii [44]. The signal-side B meson (B_sig) is reconstructed by reconstructing the signal-side hadron (π, K, D_s or D^0). Charged particles (pions and kaons) are required to have their point of closest approach to the interaction point (IP) within 2 cm and 3 cm in the directions transverse and parallel to the beam direction, respectively, to meet particle identification requirements, and to have a lab-frame momentum greater than 200 MeV/c. The B → D_s and D channels apply further selections on the invariant masses of combined tracks to find D meson candidates, and a vertex fit is performed on the D candidate.
The other B meson (B_tag) in the event is reconstructed with the Full Event Interpretation (FEI) algorithm, which uses over 10,000 channels. We refer to the remaining tracks and calorimeter clusters not used in the reconstruction as the Rest of Event (ROE).
The B_tag is required to have a beam-energy-constrained mass of more than 5.27 GeV/c^2, and the FEI must return a signal probability greater than 0.005. The total unused ECL energy must be less than 0.4 GeV. Two multivariate classifiers are trained to separate the signal process from q q̄ and non-signal B events, respectively. These are trained on variables that describe the shape of the events.
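As an illustration of the tag-side requirements, the sketch below applies the three cuts quoted above to a single candidate. It assumes the usual B-factory definition of the beam-energy-constrained mass, M_bc = sqrt(E_beam^2 − |p_B|^2) with E_beam = √s/2 and p_B the B momentum in the centre-of-mass frame; the function names are hypothetical and this is not the basf2 implementation.

```python
import math

SQRT_S = 10.58  # GeV, centre-of-mass energy quoted in the text


def beam_constrained_mass(p_cm):
    """M_bc in GeV/c^2 from the B momentum vector (GeV/c) in the CM frame.

    ASSUMPTION: M_bc = sqrt(E_beam^2 - |p_B|^2) with E_beam = sqrt(s)/2.
    """
    e_beam = SQRT_S / 2.0
    px, py, pz = p_cm
    mass_squared = e_beam ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(mass_squared, 0.0))


def passes_tag_selection(p_cm, fei_probability, unused_ecl_energy):
    """Apply the M_bc, FEI signal probability and unused ECL energy cuts."""
    return (
        beam_constrained_mass(p_cm) > 5.27
        and fei_probability > 0.005
        and unused_ecl_energy < 0.4
    )


if __name__ == "__main__":
    # Illustrative candidate, not taken from the analysis.
    print(passes_tag_selection((0.33, 0.0, 0.0), 0.01, 0.1))
```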
The search method looks for resonances in the momentum of the signal-side hadron in the B_sig rest frame, which should peak due to two-body kinematics. The B_sig four-momentum is calculated by subtracting the four-vectors of the tag side and the ROE from that of the e^+e^− beams.
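The position of the expected peak follows from standard two-body kinematics. The sketch below evaluates the hadron momentum in the B rest frame for a given ALP mass; the particle masses are approximate values that we insert for illustration and are not quoted in the text.

```python
import math

# ASSUMPTION: approximate masses in GeV/c^2, inserted for illustration only.
M_B = 5.279
HADRON_MASS = {"pi": 0.1396, "K": 0.4937, "D0": 1.8648, "Ds": 1.9683}


def hadron_momentum(m_h, m_a, m_b=M_B):
    """Momentum (GeV/c) of h in the B rest frame for the decay B -> h a."""
    if m_a < 0 or m_a > m_b - m_h:
        raise ValueError("ALP mass outside the kinematic range [0, m_B - m_h]")
    # Kallen function written in factorised form.
    kallen = (m_b ** 2 - (m_h + m_a) ** 2) * (m_b ** 2 - (m_h - m_a) ** 2)
    return math.sqrt(max(kallen, 0.0)) / (2.0 * m_b)


if __name__ == "__main__":
    for name, mass in HADRON_MASS.items():
        print(name, "max ALP mass:", round(M_B - mass, 3), "GeV/c^2")
```

The kinematic limit m_a < m_B − m_h also shows why the π and K channels cover a wider ALP mass range than the D^0 and D_s channels.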
The signal peak is fitted for each generated mass hypothesis using a convolution of two Gaussians. The free parameters of this probability density function (PDF) are then parameterised as a function of the mean of the Gaussians in order to have a well-defined signal PDF anywhere in the search range. The background distribution is parameterised using a kernel density estimator [46].
A combined fit is performed over background-only MC, with only the yields of the signal and background PDFs allowed to float, and a 90% confidence interval on the number of signal events yielded by the fit is calculated. We use this upper limit to estimate the branching fraction sensitivity of the analysis. Our result is shown in Fig. 2. The largest discrepancy in the sensitivity between MC and real data is driven by the difference in the FEI performance between MC and data. The efficiency of the FEI can be 30% lower on data than on MC. The sensitivity limits shown here include a correction factor for the FEI efficiency derived from an analysis of B^± → K^± J/ψ, J/ψ → ℓ^+ℓ^− decays, where ℓ ∈ {e, µ}.
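A yield limit of this kind is commonly converted into a branching-fraction limit by dividing by the signal efficiency and the number of B mesons in the sample. The sketch below assumes that simple counting relation; the function name, the `fei_correction` argument and the numbers in the usage line are illustrative placeholders, not values from this analysis.

```python
def branching_fraction_limit(n_signal_ul, efficiency, n_b_mesons, fei_correction=1.0):
    """Convert a signal-yield upper limit into a branching-fraction upper limit.

    ASSUMPTION: BR_UL = N_UL / (efficiency * fei_correction * N_B), where
    fei_correction rescales the MC efficiency to the one expected on data.
    """
    denominator = efficiency * fei_correction * n_b_mesons
    if denominator <= 0:
        raise ValueError("efficiency, correction and B count must be positive")
    return n_signal_ul / denominator


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the call.
    print(branching_fraction_limit(n_signal_ul=10.0, efficiency=0.01, n_b_mesons=1e9))
```

A correction factor below one, as expected if the FEI is less efficient on data than on MC, weakens the limit accordingly.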

Di-photon decay of ALP
In the a → γγ study, the signal side is fully reconstructed by combining a hadron h = π, K, D^0, D_s with a combination of two photons. All charged particles are required to have a distance from the e^+e^− IP smaller than 0.2 cm in the plane transverse to the beams and smaller than 1.0 cm along the beam direction. The binary ratio L(K/π) ≡ L(K)/(L(K) + L(π)) is used to identify the species of charged particles, where L(h) is the likelihood for a kaon or pion to produce the signals observed in the detectors. Charged particles with L(K/π) > 0.6 are identified as kaons and those with L(π/K) > 0.6 as pions. Photons are required to have an energy E > 0.05 GeV. The π^0 and ALP candidates are formed from combinations of two photons, the former within the invariant mass range of [0.125, 0.140] GeV/c^2, corresponding to ±2.5σ around the nominal π^0 mass. To suppress peaking backgrounds originating from π^0 and η in the mass spectrum of ALP candidates, we veto the di-photon mass ranges of [0.120, 0.160] GeV/c^2 and [0.520, 0.590] GeV/c^2. An additional requirement that the π^0 momentum be larger than 0.5 GeV/c is applied. The D mesons are reconstructed from the final states K^∓π^±, K^∓π^±π^0, and K^±π^∓π^±π^∓ within the invariant mass ranges of [1.845, 1.885] GeV/c^2, [1.833, 1.890] GeV/c^2, and [1.850, 1.880] GeV/c^2, respectively. The D_s^± mesons are reconstructed in the channel K^±K^∓π^∓ with an invariant mass requirement of [1.950, 1.990] GeV/c^2. These ranges are approximately 3σ on either side of the nominal mass. The D and D_s^± momenta are recalculated from a fit of the momenta of their decay products that constrains them to a common origin and their mass to the nominal mass of the D and D_s^±, respectively. The B meson candidates are selected to have a beam-energy-constrained mass larger than 5.27 GeV/c^2 and an energy difference in the range of [−0.4, 0.4] GeV.
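Part of this selection can be written compactly as a few predicates. The sketch below encodes the likelihood-ratio identification, the photon energy requirement, the π^0/η veto windows and the final B candidate cuts with the thresholds quoted above; the function names are hypothetical and the code is not the analysis software.

```python
PI0_ETA_VETO = ((0.120, 0.160), (0.520, 0.590))  # GeV/c^2, windows quoted in the text


def kaon_pion_ratio(l_kaon, l_pion):
    """L(K/pi) = L(K) / (L(K) + L(pi)); L(pi/K) is its complement."""
    total = l_kaon + l_pion
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    return l_kaon / total


def identify(l_kaon, l_pion, threshold=0.6):
    """Return 'K', 'pi' or None following the > 0.6 requirements."""
    ratio = kaon_pion_ratio(l_kaon, l_pion)
    if ratio > threshold:
        return "K"
    if 1.0 - ratio > threshold:
        return "pi"
    return None


def passes_alp_diphoton_selection(m_gg, e_gamma1, e_gamma2, delta_e, m_bc):
    """Photon energy, pi0/eta veto, energy difference and M_bc requirements."""
    if min(e_gamma1, e_gamma2) <= 0.05:
        return False
    if any(low <= m_gg <= high for low, high in PI0_ETA_VETO):
        return False
    return abs(delta_e) <= 0.4 and m_bc > 5.27
```

Tracks with a ratio between 0.4 and 0.6 are identified as neither species, which is why `identify` can return `None`.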
To suppress the background originating from e^+e^− → q q̄ events, a boosted decision tree (BDT) classifier is used to veto candidates from continuum events; it is trained on equal numbers of simulated signal and continuum events. In contrast to the invisible search, the nominal approach for a → γγ is to scan for resonances directly in the invariant mass spectrum of the two photons.
The signal PDFs are determined by using simulated signal samples with different mass hypotheses. Generally, a combination of two Gaussians and one asymmetric Gaussian is fitted to describe the signal shape, which is well-defined. The background is described by a second-order polynomial function.
An unbinned maximum likelihood fit is performed on the background-only simulated sample. During the fit, the background yield is the only parameter allowed to float, and a 90% confidence interval is calculated, which is used to estimate the branching fraction upper limit.

Observation potentials at Belle and Belle II
In this section, we discuss the observation potential of ALPs produced from B meson annihilation decays at Belle and Belle II. In Figs. 4 and 5, we overlay the achievable upper limits on the branching ratios for each channel at Belle and Belle II (as obtained in the previous subsections) and the pQCD predictions of the branching ratios (as obtained in Section 2). The Belle II upper limits shown in Figs. 4 and 5 are extrapolated from the Belle results.
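The text does not state how the extrapolation to Belle II luminosities is performed. A common rule of thumb for background-dominated searches is that the expected limit improves as the inverse square root of the integrated luminosity; the sketch below implements only that assumed scaling and should not be read as the procedure used for Figs. 4 and 5.

```python
import math

BELLE_LUMINOSITY_AB = 0.711  # ab^-1, i.e. the 711 fb^-1 quoted in the text


def extrapolate_limit(limit_at_reference, target_luminosity_ab,
                      reference_luminosity_ab=BELLE_LUMINOSITY_AB):
    """Scale an expected upper limit to a new integrated luminosity.

    ASSUMPTION: background-dominated regime, limit proportional to 1/sqrt(L).
    """
    if target_luminosity_ab <= 0 or reference_luminosity_ab <= 0:
        raise ValueError("luminosities must be positive")
    return limit_at_reference * math.sqrt(reference_luminosity_ab / target_luminosity_ab)


if __name__ == "__main__":
    # Placeholder Belle limit, scaled to the two luminosities used in the figures.
    for target in (5.0, 10.0):
        print(target, "ab^-1:", extrapolate_limit(1e-6, target))
```

In a nearly background-free search the improvement would instead be closer to linear in the luminosity, so this scaling is on the conservative side.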
Fig. 4 shows that for the invisible final state with the Belle dataset, ALPs that couple to up quarks with g_u = 1 are in the observable range for the B → K^± a and B → π^± a channels.
Figure 4: Belle and Belle II sensitivity compared with the theoretical prediction for B → h(a → invisible). The exclusion regions in blue, yellow, green and red are those for B → ha with h = K^±, π^±, D^0, D_s, respectively. The dashed line is the simulated Belle sensitivity; the dotted and dash-dotted lines are the extrapolations to 5 ab^−1 and 10 ab^−1, respectively. The coloured solid lines are theory predictions with g_i = 1 (i = u, d, s, c). The filled grey bands are the veto ranges of B → K^0 π and B → D^(*)0 π, respectively.
The strange type model, g_s = 1, can also be accessed with the B → K^± a channel at Belle II in a few years' time (i.e. with 10 ab^−1). The models with non-zero down type coupling, g_d = 1, might be accessible with the full Belle II dataset via the B → D^0 a channel. The B → D^0 a and B → D_s a channels are sensitive to the models where the ALP couples to the charm quark, though the branching ratio for g_c = 1 is too small to be observed at Belle (II) unless there is some new physics model that allows g_c to be enhanced.
From Fig. 5, a similar conclusion can be drawn for the di-photon final state: ALPs that couple to up or strange quarks with g_{u,s} = 1 can already be observed in the B → K^± a channel at Belle, and the down type model, g_d = 1, especially for lower ALP masses, may be accessible in the B → D^0 a or B → π^± a channels with Belle II. For the models where the ALP couples to the charm quark, the best channel is the B → D_s a decay but, again, a model that allows a large g_c is needed.

Conclusions
In this article, we investigated ALP production from the weak annihilation transition in B meson decays. First, we have shown that a large variety of final states is possible from this new mechanism, i.e. B → ha, where h = D^{0(*)}, K^{±(*)}, K^{0(*)}, D ... We performed a pQCD based theoretical computation to predict the branching ratios of these processes. We found that, assuming the ALP-fermion coupling to be g_i = 1, the branching ratios are roughly 10^−6 for the models where the ALP couples to the light quarks (i.e. up, down, strange), 10^−7 for the charm quark, and 10^−9 for the bottom quark. The suppression for the bottom quark is the well-known 1/m_b effect. Based on this computation, we identified the four channels with large branching ratios, B → ha with h = D^0, K^±, D_s, π^±, and then performed sensitivity studies for these channels with Belle (II).
We focused on the ALP decays into invisible or di-photon final states, which are the strength of the Belle (II) experiment. In order to obtain the achievable upper limits on the branching ratios, we performed a detailed sensitivity study using Belle MC. In both final states, the upper limits on the branching ratios for the K^± and π^± channels are an order of magnitude better than the ones for the D^0 and D_s channels. This is partly because the latter decay further via the weak interaction and only a subset of their decay channels can be used, as the remaining channels have higher multiplicity and are difficult to analyse (i.e. much higher background). The K^± and π^± channels are sensitive to the models where the ALP couples to up, down or strange quarks. Thus, we found that those models could be observed at the Belle (II) experiment. On the other hand, although our theoretical prediction shows that the D^0 and D_s final states are the most promising channels for the models with couplings to charm or bottom quarks, the predicted branching ratio is too small to be observed at Belle (II). However, these channels are still interesting to investigate for the models where the coupling between quarks and the ALP is induced by the derivative coupling, for which the coupling is proportional to the quark mass and is enhanced for heavy quarks.
Figure 1: ALP production from B meson annihilation decays: the ALP can be produced at any of the quark lines (cross mark), depending on the ALP coupling to different types of quarks. In this figure, we categorise the annihilation diagrams by the Cabibbo factors as well as colour-allowed/suppressed tree or penguin topologies.
The indices a and b indicate ALP emission in Fig. 1 from the initial b, d and the final c, u quarks, respectively.

Figure 2: Expected upper limit for B → h(a → invisible) at the Belle experiment, where h is π, K, D^0, D_s. The filled grey bands are the veto ranges of B → K^0 π and B → D^(*)0 π, respectively.
Fig. 3 shows the corresponding branching fraction upper limit from the Belle simulation.

Figure 5: Belle and Belle II sensitivity compared with the theoretical prediction for a → γγ. The exclusion regions in blue, yellow, green and red are those for B → ha with h = K^±, π^±, D^0, D_s, respectively. The dashed line is the simulated Belle sensitivity; the dotted and dash-dotted lines are the extrapolations to 5 ab^−1 and 10 ab^−1, respectively. The coloured solid lines are theory predictions with g_i = 1 (i = u, d, s, c). The filled grey bands in (b) are the veto ranges of π^0 → 2γ and η → 2γ, respectively.

The longitudinal and transverse polarisations of the D^* can be given, respectively, as
We use the following spin projection and the wave function to describe the B and D^(*) mesons:
where v is the unit vector in the direction of p_K and n_− is the opposite direction. The meson distribution amplitudes are
with N_B = 92 GeV, ω_B = 0.4 GeV, C_{D^(*)} = 0.7. This wave function is normalised to match the definitions of the decay constants and their derivatives introduced earlier:
The weak Hamiltonian which gives the B^− → K^− a process is
where the operators O_i are defined as
In order to have the initial b quark and u quark in the same current (see Fig. 1), O_{3,10} must be Fierz transformed in the Dirac space.
O^F_{3,9} ∝ (ū_j b_i)_{V−A} (s̄_i u_j)_{V−A},   O^F_{4,10} ∝ (ū_j b_j)_{V−A} (s̄_i u_i)_{V−A},
O^F_{5,7} ∝ −2 (ū_j b_i)_{S−P} (s̄_i u_j)_{S+P},   O^F_{6,8} ∝ −2 (ū_j b_j)_{S−P} (s̄_i u_i)_{S+P}   (47)
where (ψ̄ψ′)_{S±P} = ψ̄(1 ± γ_5)ψ′. After applying the Fierz transformation to the colour space, we find the combination of the Wilson coefficients a_2 O_2, a_4 O^F_4 and a_6 O^F_6, where
a_4 = C_4 + C_3/N_C + C_10 + C_9/N_C,   a_6 = C_6 + C_5/N_C + C_8 + C_7/N_C.
The amplitude of B^− → K^− a can be obtained by sandwiching this Hamiltonian between the initial and the final states:
The amplitudes for the ALP emission from the initial b and u and the final s and u in Fig. 1 are given, respectively
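The effective coefficients a_4 and a_6 that enter the B^− → K^− a amplitude are simple linear combinations of the Wilson coefficients. The sketch below evaluates them for N_C = 3; the input dictionary holds made-up numbers that only demonstrate the call, since the actual scale-dependent values of C_3 to C_10 are not given in the text.

```python
N_C = 3  # number of colours


def effective_coefficients(wilson, n_c=N_C):
    """Return (a_4, a_6) from a mapping {operator index: Wilson coefficient}.

    a_4 = C_4 + C_3/N_C + C_10 + C_9/N_C
    a_6 = C_6 + C_5/N_C + C_8 + C_7/N_C
    """
    a_4 = wilson[4] + wilson[3] / n_c + wilson[10] + wilson[9] / n_c
    a_6 = wilson[6] + wilson[5] / n_c + wilson[8] + wilson[7] / n_c
    return a_4, a_6


if __name__ == "__main__":
    # HYPOTHETICAL inputs, not physical Wilson coefficients.
    example = {3: 3.0, 4: 1.0, 5: 6.0, 6: 2.0, 7: 9.0, 8: 0.0, 9: -3.0, 10: 0.0}
    print(effective_coefficients(example))
```

In an actual pQCD evaluation these combinations would be recomputed at each value of the scale t inside the convolution integrals.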