Muon g-2 and rare top decays in up-type specific variant axion models

The invisible variant axion models (VAM's) offer a very attractive solution for the strong CP problem without the domain wall problem. We consider the up-type specific variant axion models and examine their compatibility with the muon $g-2$ anomaly and the constraints from lepton flavor universality, several flavor observables, and top quark measurements. We find that the combined $\chi^2$ fit favors the parameters $m_A\sim 15$ GeV and $\tan\beta \sim 40$, the same as the type-X 2HDM. Moreover, we find that there are no conflict with any flavor observables as long as the mixing angle $\rho_u$ is sufficiently small. In particular, a small nonzero mixing angle $\rho_u \sim \pi/100$ is slightly favored by the observed $B_s \to \mu\mu$ branching ratio. The up-specific VAM predicts the flavor-violating top rare decay $t\to uA$ followed by $A \to \tau\tau$, which would provide a smoking gun signature at the LHC. We show that current searches of $A$ already impose some constraints on the parameter space but are not sensitive to the most interesting light $m_A$ region. We propose an efficient search strategy that employs di-tau tagging using jet substructure information, and demonstrate that it can enhance the sensitivity on $BR(t \to uA)$, especially in the light $m_A$ region. This model also predicts the flavor-violating decay of heavy Higgs bosons, such as $H \to t u$, that would suppress the $H \to \tau\tau/\mu\mu$ decays. We also examine the up-specific VAM with the muon-specific lepton sector and the down-type specific VAM's as interesting alternative scenarios.


I. INTRODUCTION
The strong CP problem of an empirically tiny CP-violating phase in QCD, θ QCD , can be solved by employing a U (1) Peccei-Quinn (PQ) symmetry [1], with which one can rotate away the undesired phase. Such a U (1) P Q is assumed to be anomalous and broken spontaneously, resulting in the existence of a pseudo Nambu-Goldstone boson, called the axion [2,3], whose dynamics is characterized by the axion decay constant f a . Such a model is subject to various experimental constraints. Axion helioscopes and astronomical observations give a lower bound of f a > ∼ 10 9 GeV (see, for example, Ref. [4]). On the other hand, coherent oscillations of the axion field can play the role of a cold dark matter in the Universe [5][6][7], from which one has f a ∼ 10 11−12 GeV [8], provided that the axion is the dominant component of the dark matter. However, the model has a serious problem of domain wall formation in the early Universe. This is because the number of discrete vacua separated by the domain walls is related to the number of fermion generations, which is 3 in the standard model (SM). Such a problem can be resolved by assuming that only one right-handed (RH) quark is charged under the PQ symmetry, thus rendering a unique vacuum [9,10].
Consistency of the axion model requires the use of two Higgs doublet fields Φ 1 and Φ 2 , with Φ 1 charged under the PQ symmetry while Φ 2 being neutral. Such an arrangement of assigning PQ charges to one Higgs doublet field and one RH quark would lead to flavorchanging neutral scalar (FCNS) couplings in the quark sector [11]. Depending on whether the RH quark belongs to the up or down sector, the FCNS interactions could respectively happen among the up-or down-type quarks. For example, a top-specific variant axion model (VAM) has recently been studied in Refs. [12,13]. Since such FCNS couplings depend on the chirality of fermions, the VAM presents a different Yukawa structure from, for example, the common two-Higgs doublet models (2HDM's). As the SM lepton sector is irrelevant to the above-mentioned domain wall problem, one has the freedom of assigning either zero or non-zero PQ charge to the leptons.
In general, we have six possible choices of assigning a non-zero PQ charge to one of the RH quark fields: u R , c R , t R , d R , s R and b R . In this work, we mainly discuss a scenario in which one of the up-type RH quark fields is charged under U (1) P Q . Depending on the mixing parameters, one can obtain as a special case the top-specific VAM examined in Refs. [12,13]. We will show that the mixing parameters are constrained under various experimental constraints such as the Higgs signal strength data and neutral D meson mixing, and left with three possible regions corresponding to up-specific, charm-specific and topspecific scenarios. The scenario of having one of the down-type RH quark fields charged under U (1) P Q , as we will show, is more severely constrained by low-energy flavor physics data.
Motivated by the long-standing puzzle of a 3σ-level deviation [14] in the muon anomalous magnetic dipole moment (a µ or (g − 2) µ ) from the SM, we find it advantageous to make all leptons charged under the PQ symmetry as well in the model. In fact, as far as the lepton sector and the third-generation quarks are concerned, the up-specific and charmspecific models become effectively identical to the usual Type-X 2HDM. Since the up or charm Yukawa coupling is too small to affect the direct search constraints of additional Higgs bosons in collider experiments, which are mainly determined by the third-generation Yukawa couplings, most of the same constraints on Type-X 2HDM can be directly applied here. In particular, it is known that Type-X 2HDM is difficult to constrain at the LHC and that it can explain the muon g − 2 deviation by taking large tan β values of ∼ 40 − 50 and a light pseudo-scalar Higgs boson A with m A ∼ 20 GeV [15]. One of the efforts to constrain such a model at the LHC can be found in Ref. [16]. We will see that the up-specific model shares the same parameter set to explain the (g − 2) µ deviation, while the charm-specific model is not favored as the two-loop contribution to (g − 2) µ is not negligible and has the opposite sign. Note that the requirements of large tan β and perturbativity of the top Yukawa coupling prohibit us from assigning a non-zero PQ charge to the RH top quark, as examined in Ref. [13].
In the up/charm-specific model with the above-mentioned setup, an interesting rare top decay t → u/cA followed by A → τ τ or µµ is predicted. Even though there is no dedicated experimental study focusing on this process, we find that searches for bbA production followed by A → τ τ (or µµ) already constrains the parameter space of this model. Nevertheless, a dedicated search of the t → u/cA rare decay would still provide a better sensitivity to this model. We will propose an efficient strategy using the tau-tag algorithm with the jet substructure information and show that the sensitivity would be much enhanced. In this model, the heavy Higgs bosons also have flavor-violating decay modes. Those flavor-violating processes would provide smoking-gun signatures of the model at the LHC.
As a solution to the domain wall problem, we have more freedom in assigning the PQ charges in the lepton sector. If only µ is PQ charged, the lepton sector becomes identical to the so-called muon-specific 2HDM, which is shown to successfully accommodate (g − 2) µ without relying on the 2-loop contribution with a light A boson [17]. We will see that the up-specific VAM with the muon-specific lepton sector is another attractive possibility as it is not constrained by the lepton universality measurements and no tuning is required to suppress h → AA, thanks to the absence of such light particles. Unlike the original muonspecific model, the up-specific VAM with the muon-specific sector predicts that the heavy Higgs bosons can decay into a pair of flavor-violating up-type quarks such as H/A → tu at a significant branching fraction. It thus suppresses the H/A → µµ decay, making the 4µ constraint at the LHC less effective and opening up more parameter space.
This paper is organized as follows. Section II discusses the structure of the Higgs sector in the VAM and possible scenarios of PQ charge assignment to the quark fields. We work out the FCNS couplings of the SM-like Higgs boson to the SM fermions. Section III is devoted to studying possible effects of the FCNS interactions on physical observables. Current data such as the muon g−2 anomaly, the lepton universality in τ decays, the lepton universality in Z decays, rare B decays and D meson mixing, and top observables are imposed to constrain the mixing parameters in the Higgs sector. In Section IV, we focus on a promising signature of the model, namely the rare t → uA and t → cA decays. We show the current constraints from existing searches, and propose an effective way to look for the signature of this model by introducing boosted A → τ τ tagging using jet substructures. Section V discuss another interesting possibility of the charge assignments for the lepton sector and how the down-type VAM is severely constrained by data. We summarize our findings of this study in Section VI.

II. UP-TYPE SPECIFIC VARIANT AXION MODEL
In a minimal setup of the VAM, we have two Higgs doublet fields Φ 1 and Φ 2 and a scalar field σ with PQ charges −1, 0 and 1, respectively. 1 The σ field, a SM gauge singlet scalar, is introduced to break the PQ symmetry spontaneously at a high energy scale by acquiring a vacuum expectation value (VEV) f a while it does not play much a role at low energies. In general, one can choose any one of the six RH quark fields to be charged 1 Note that there is a difference in the convention between this work and Ref. [13]. The PQ-charged Higgs field is Φ 1 in the former case and Φ 2 in the latter.
under U (1) P Q . It is also all right for leptons to carry either zero or non-zero PQ charges.
We first argue that the requirements of accommodating the muon g − 2 anomaly and perturbativity in the Yukawa couplings largely restrict ourselves to only the scenario of uptype specific VAM's. Consider first the scenario where the leptons do not carry the PQ charge and therefore have to couple only with Φ 2 . In this case, the VEV's of the Higgs fields must satisfy the hierarchy: Φ 2 Φ 1 . In order to reproduce its mass, the RH top quark has to carry a non-zero PQ charge and therefore couple to Φ 1 . As far as the down-type quark sector and the lepton sector are concerned, the model is identical to the In the scenario where the leptons are charged under U (1) P Q , the VEV's must satisfy instead Φ 1 Φ 2 (corresponding to tan β 1) in order for the lepton Yukawa couplings to be sufficiently large to explain ∆a µ . In this case, t R cannot be the one carrying a non-zero PQ charge because the top Yukawa would become non-perturbative. We are then left with the choices of assigning a non-zero PQ charge to one of the remaining five RH quark fields.
In the following, we will formulate the up-type specific VAM's as an explicit example, and comment on stringent constraints on the down-type specific VAM's from low-energy flavor physics data.
Following the above argument, we assign a non-zero PQ charge of −1 to the RH up or charm quark field u R /c R . As explicitly shown below, these two possibilities are related by a rotation in the field space. The Yukawa interactions are given by: where the family indices a ∈ {2, 3}, and i, j ∈ {1, 2, 3}. Explicitly, Y u1,u2 assume the forms where * denotes a generally non-zero entry.
As in the 2HDM's, one can rotate the Higgs doublet fields into the Higgs basis: where the new Higgs fields The mass eigenstates of the CP-even neutral Higgs bosons, H and h with m h = 125 GeV < m H , are relative to h SM and h through a rotation: In this basis, the original Lagrangian is written as: where, Throughout this paper, we will use the shorthand notation: s θ = sin θ and c θ = cos θ. With the explicit forms in Eq. (2), we have The up-type quark mass matrix can be diagonalized via a bi-unitary transformation, In this mass basis, where and the non-trivial flavor structure among the u R , c R , and t R fields is encoded in the matrix H u . Assuming no additional CP phase for simplicity and without loss of generality, we can parametrize the mixing matrix V u with a u − t mixing angle ρ u and a u − c mixing angle ψ u as The first and second matrices on the right-hand side of the above equation represent respectively the mixing between u R and c R and the mixing between u R and t R . In the next section, we will see that both ρ u and ψ u are constrained to be close to 0 or π. We note that (ρ u , ψ u ) ≈ (0, 0) corresponds to a PQ-charged up quark, (ρ u , ψ u ) ≈ (0, π) to a PQ-charged charm quark, and (ρ u , ψ u ) ≈ (π, 0 or π) to a PQ-charged top quark. The last case is identical to the scenario discussed in Ref. [13], where tan β is restricted to moderate values and the muon g − 2 cannot be explained. With this parameterization, the explicit form of H u becomes The Yukawa Lagrangian can now be cast into a simpler form: where and where T 3 f denotes the SU (2) L eigenvalue of a fermion f (i.e., T 3 f = +1/2 for u, c, t, and −1/2 for d, s, b, e, µ, τ ). The FCNS interactions only appear in the up-quark sector, and the these interactions can be written as The pattern of the FCNS interactions of h, H is hence identical to that of A but suppressed by c β−α and s β−α , respectively. Although we parametrize the mixing matrix V u and H u using two mixing angles ρ u and ψ u , we will see in the next section that the u − c mixing angle ψ u receives stronger constraints than the u − t mixing angle ρ u from the D −D oscillation.
Therefore, it is reasonable to fix ψ u = 0 in the mixing matrix V u for other phenomenological studies. In this limit, we have

III. CONSTRAINTS
In this section we consider the current experimental constraints on this model. We examine first the constraints from low energy observables: muon g − 2, lepton universality in τ and Z decays, and the B s → µµ decay. We perform a χ 2 -fit to those four observables to find the preferred parameter region. We consider the observables one by one as follows.
Muon g − 2 The discrepancy between the experimental measurement [18] and the SM prediction of the muon anomalous magnetic moment [19] is a long-standing puzzle, and the deviation [20] ∆a µ = a EXP µ − a SM µ = (262 ± 85) × 10 −11 (17) is at about 3.1σ level. Additional 1-loop contributions from the h, H, A, H ± bosons in the VAM are the same as those in the usual 2HDM [15], and the sum is given by where G F is the Fermi decay constant, m µ is the muon mass, α em is the fine structure constant, r j f = m 2 f /m 2 j and the loop functions Note that the sign of each contribution in Eq. (18) where the loop functions are defined as The explicit values of the 1-loop contribution and the 2-loop Barr-Zee contributions for the case of m H = m A = 1 TeV are shown in Table III. To obtain ∆a µ from the table, one needs to multiply a common pre-factor of Note that only the t, c, u contributions for A change the sign while those of b, τ do not.
The sign of each contribution is determined by the corresponding ξ φ µµ ξ φ f f , which is proportional to ζ µµ ζ f f for the A contributions and for H in the aligned limit (c β−α = 0). The last column in the table summarizes the sign of each contribution δ H/A modulo ζ µµ ζ f f . For the parameter region of m A < m H = m ± H , the 1-loop contribution is always negative. Hence, a significant positive contribution is required from the 2-loop Barr-Zee diagrams. The Barr-Zee diagram contribution involving a light A is proportional to ζ f f = − tan β or cot β. With ζ µµ = − tan β, we see from the table that only the τ -loop offers a positive contribution among the tan 2 β enhanced contributions. With a large tan β enhancement, it dominates to compensate for the other negative 1-loop and 2-loop contributions. In the up-type VAM, the bottom loop contribution is negative but negligible since tan β is canceled by cot β. We find that the charm-specific VAM is not preferred as the charm loop contribution is negative and enhanced by tan 2 β, while the up-specific VAM is still viable as one can neglect the small up Yukawa coupling.
Moreover, any non-zero mixing between u and t, ρ u = 0, does not help explaining the muon g − 2 in the up-specific VAM because a non-zero ρ u always reduces ξ A tt and may turn the originally positive top-loop Barr-Zee contribution negative. Therefore, a large value of ρ u is not favored. This is also seen in the charm-specific VAM for the same reason. The parameter region consistent with the muon g − 2 observation is shown in purple in Fig. 1 for ρ u = 0, π/50, and π/20. (right). The 1σ and 2σ regions preferred by the muon g − 2 is drawn in dark and light purple, respectively [15]. The constraint of lepton universality in the τ [15] (Z [23]) decays is given by the green (red) curve, and the region above it is excluded at 95% confidence level (CL).

Lepton universality in τ, µ decays
The lepton decay processes τ → µνν, τ → eνν, and µ → eνν can be used to constrain this model, as they can be mediated at tree level by H ± in addition to the W ± bosons. This is true for any 2HDM's in general. There are also loop contributions mediated by h, H, A and H ± [15]. The Heavy Flavor Averaging Group (HFAG) gives constraints on the coupling ratios g /g , where , = e, µ, τ [21]. Here we quote the deviations of these ratios from their SM values,δ ≡ (g /g ) − 1:δ τ µ = 0.0011 ± 0.0015,δ τ e = 0.0029 ± 0.0015, and δ µe = 0.0018 ± 0.0014.
By neglecting the electron mass, we where the loop functions Taking correlations among the observables into account, we use the following three independent quantities for the χ 2 -fit [15] Both δ VAM tree and δ VAM loop are negative in the VAM, while the observed data prefer to have positive values. Thus, we can set an upper bound on tan β for a fixed value of m A . The 95% CL excluded region is overlaid as the green area in Fig. 1. Note that this constraint is independent of the quark sector and thus the quark mixing parameter ρ u .
Corrections due to the A, H, H ± loops in the VAM are found to be [23] δ VAM µµ 0 , and δ VAM

and the loop functions
The renormalization constant ∆ = 2/ − γ + ln(4π) from dimensional regularization will While the contributions from the VAM's are negative, the present data exhibit slightly larger values than the SM predictions. Therefore, the large tan β region with an enhancement in the Aτ τ coupling is disfavored. The region excluded at 95% CL is overlaid in the red region in Fig. 1. Note that this constraint is insensitive to the quark sector. From the above considerations, it is seen that small m A and large tan β are preferred in the up-specific VAM.
In the SM, the b → s flavor-changing neutral current (FCNC) processes are mediated by the loop diagrams with the W ± boson and top quark. In the VAM with non-zero quark mixing angle ρ u , the pseudoscalar A can couple to the top quark through the ξ A tt coupling. Since A is preferred to be light according to the discussions in the previous subsection, its contribution to B s → µ + µ − cannot be neglected. The time-integrated branching ratio BR(B 0 s → µ + µ − ) EXP averaged between the LHCb [24] and CMS [25] Collaborations normalized to the SM prediction is reported to be: The combined SM and VAM contribution to this observable is [26] where ∆Γ s = 0.081 ps −1 and Γ s L = 1/1.428 ps −1 are the decay width difference between the two B s mass eigenstates and the width of the lighter mass eigenstate, respectively. The pseudoscalar and scalar contributions are given by where C 10 , C S , and C P are the Wilson coefficients of the effective four-fermion operators O 10 , O S , and O P that include contributions from both SM and VAM, with the detailed expressions given in Ref. [26]. The measured value ofR sµ is slightly smaller at the 1σ level than the SM predictionR sµ = 1 out of P = 1 and S = 0.
In the up-specific VAM, the main contribution appears in P and one can neglect S. The contribution from the top-W loop diagrams with a pseudoscalar A propagator is proportional to −ξ A tt ξ A µµ . For ρ u = 0, the contribution is positive and independent of tan β, with ∆P ∼ 0.21 for m A = 15 GeV. As ρ u increases, it decreases to zero at ρ u 2/ tan β and eventually becomes negative. The bottom quark mediated diagrams with a pseudoscalar A propagator contribute negatively as ∆P ∼ −0.17 for m A = 15 GeV, which is independent of tan β as it is proportional to −ξ A bb ξ A µµ . In summary, a small but non-zero ρ u is preferred to fit the current B 0 s → µ + µ − data, which exhibits a small downward deviation ∆P ∼ −0.1. There exists another solution corresponding to ∆P ∼ −1.9.
(left plot) and ρ u = π/100 (right plot). Shapes of these contours depend sensitively on the value of ρ u . Fig. 3 shows the χ 2 -contours for all the above-mentioned observables taken into account, for ρ u = 0 (left plot) and ρ u = π/100 (right plot). The best fit point in (m A , tan β) plane is located around (15 GeV,40) for the whole relevant range of ρ u < ∼ 0.06, which is the consistent range to the top total width measurement as we will see in the next section.
Such a set of parameters is favored in comparison with the SM (χ 2 SM = 14.8) at about 1-2 σ level. The ρ u π/100 case would provide a best fit (χ 2 = 11.3) mainly due to the observed B s → µ + µ − decay branching fraction. In summary so far, the scenario of m A 15 GeV and tan β 40 with a small mixing angle ρ u is preferred by muon g − 2 without conflicts with the other observables in the up-specific VAM.
In the general case when the non-zero flavor violating parameters ψ u and ρ u are allowed, are proportional to ξ A ut ξ A ct due to A and t running in the loop. In the VAM, the D 0 oscillation imposes the following constraints [27] tree : loop : The constraints in the ρ u -ψ u plane for tan β = 40 and m A = 15 GeV are shown in Fig. 4, where the hatched green region is excluded at 95% CL. We neglect the interference effects.
The exclusion region generally depends on the value of tan β and c β−α , which is empirically found to be close to zero from 125-GeV Higgs coupling measurements. For example, with c β−α = 0.3 and tan β = 40, the constraint reads sin ρ u < ∼ 0.02. In the alignment limit c β−α = 0, however, these FCNS top decays vanish identically in the up-type VAM.
The t → Aj constraints from top quark total width A robust constraint on the FCNS couplings independent of c β−α can be obtained from the model-independent top width measurements [28], where the total top width is reported not to exceed 2.5 GeV at 95% CL. On the other hand, the VAM's predict the rare top decays of t → uA and cA. We demand that the total top decay width, including the standard partial width Γ t→bW = 1.41 GeV, be below the the upper bound; that is, Γ t,tot ≡ Γ t→bW +Γ t→uA/cA ≤

GeV with
The condition Γ t→uA/cA ≤ 1.1 GeV, corresponding to BR(t → uA/cA) < ∼ 40%, is translated to |ρ u | < ∼ 0.06 for tan β = 40 and m A = 15 GeV. Note that it currently provides the most stringent constraint on ρ u . The exclusion region is shown by the gray region in Fig. 4.
The h → AA decay When the A boson is lighter than m h /2, the decay of h → AA → 4τ is kinematically allowed. Although there is a CMS study constraining h → AA → 2τ 2b < ∼ 2 − 10% [29], we cannot find the constraint on h → AA → 4τ . Thus, we consider the branching ratio of exotic Higgs decays, which is currently bounded to BR(h → AA) < ∼ 20% [30]. Using the partial width formula we find that the trilinear scalar coupling should satisfy |λ hAA | < ∼ 3.7 GeV. In the leptonspecific 2HDM with large tan β, λ hAA can be expressed in terms of neutral Higgs boson masses and lepton Yukawa couplings as [16] λ hAA From the first two terms in the numerator, the natural size of λ hAA is about 60 GeV.
Therefore, O(10%) fine tuning is required for a cancellation between the first two terms and the third term in the numerator, in order to satisfy the experimental constraint on the h → AA decay.
Under the constraints of Higgs data: s β−α 1 and |ξ h τ τ | 1, we can consider two possible cases. The first case corresponds to the right-sign limit (ξ h τ τ → +1). However, in the large tan β limit, the bound of m H < 250 GeV is required by the perturbativity condition of the λ hAA coupling. The charged Higgs mass is then forced to be light due to the condition m H ± m H without conflicting with the electroweak precision measurements. Consequently, the region associated with the right-sign limit is ruled out due to the direct search of the charged Higgs boson at the LHC. The second case corresponds to the wrong-sign limit (ξ h τ τ → −1), where m H can be arbitrarily large and, therefore, λ hAA can be fine-tuned to vanish. The solution for an exact cancellation in λ hAA in the wrong-sign limit is for m H > ∼ 300 GeV. This is almost independent of m A provided m H m A , as is the case considered here.
Single A production at the LHC In the up-specific VAM, the A production at the LHC through the process pp → uū → A is enhanced by tan β. For example, for m A = 20 GeV, the production cross section at the 13-TeV LHC reaches It is, however, too small to detect through the di-muon resonance signal with the current integrated luminosity [31].
In this section, we have examined various current constraints on the up-type VAM. As a short summary, we have found that the up-specific VAM with m A ∼ 15 GeV and tan β ∼ 40 can provide a good explanation for the deviation found in (g − 2) µ , while the charm-specific VAM is not favored in this regard. There are no conflict with any flavor observables and experimental measurements considered in this section, as long as the mixing angles ρ u and ψ u are small. Moreover, taking non-zero ρ u at the order of π/100 offers a better fit to the B s → µµ branching fraction. This could lead to interesting phenomenology.

IV. SMOKING GUN SIGNATURES AT LHC
A. Rare top decay t → u + A In this section we focus on the up-specific VAM and first consider the most promising signature t → u + A at the LHC. The analysis shown in this section can be readily applied to the charm-specific case with t → c + A, although as seen in the previous section this case is slightly disfavored by the (g − 2) µ observation.
In the VAM with ψ u = 0, the t → u + A decay helicity amplitudes M ht,hu in the limit of m u = 0 are given by The angle θ is between the up-quark 3-momentum and the top spin and the up-quark energy which is not suppressed even in the alignment limit, in contrast to the t → uh partial decay width that is suppressed by c β−α . The branching ratio assuming Γ t→uA Γ t→bW is, with Therefore, for tan β = 40, a O(0.01) of ρ u would provide the O(%) branching ratio: (ρ u /0.01) 2 × 2.24%.

B. Current constraints from existing searches
As top quarks are copiously produced in pairs at the LHC, one should be able to constrain or search for this rare FCNS t → uA decay. Nevertheless, we cannot find a dedicated experimental study searching for this specific rare decay in the literature other than several theoretical studies [32][33][34]. We consider the main production mechanism of the pseudoscalar with several theoretical efforts being made to improve the sensitivity [38][39][40]. Among them we find that the CMS analysis at 8 TeV [36] currently provides the most stringent constraint.
The CMS has shown that the sensitivity using A → µµ is comparable but weaker than that using A → τ τ , assuming BR(A → τ τ )/BR(A → µµ) = (m τ /m µ ) 2 [37]. There are also searches using the di-tau channel on the 13-TeV data [41], although they focus on the case where the new bosons are heavy and only show a limit of m A = 300 GeV at lightest.
We follow the most stringent 8-TeV CMS analysis [36] and re-interpret it to constrain our model. For the signal analysis, we use MadGraph5+Pythia8 [42,43] for event generation and Delphes3 [44] for detector simulation. For jet reconstruction, we rely on the FastJet package [45] and use the anti-kT algorithm with the standard jet size R = 0.5. The data include three channels: eµ, eτ h , and µτ h , where τ h denotes a tau lepton decaying hadronically, and their respective selection cuts are summarized as follows: • µτ h channel: exactly one µ and one τ h with opposite charges: • eτ h channel: exactly one e and one τ h with opposite charges: where M T (p T , p / T )) = p T p / T (1 − cos φ) and φ is the azimuthal angle difference between the momentum and the missing transverse momentum. The definitions of P ζ , P vis ζ can be found in Ref. [36]. In addition to the above selection cuts, events in all the channels are required to have at least one b-tagged jet with p T,b > 20 GeV and |η b | < 2.4. For the hadronic τ tagging, we take a simpler algorithm described in Ref. [46] for the Cambridge/Aachen (C/A) jets with R = 0.5 and call a τ tag when the following conditions are satisfied: • define j core by drawing a cone with a smaller radius R = 0.1 centered at the jet, and require no tracks with p T,track > 1 GeV to lie in the annulus between 0.1 < R < 0.5; • the hardest track in j core satisfies p T > 5 GeV; and T , the fraction of jet energy deposited in the jet core.
For the selected events, we consider the possible values of m τ τ consistent with the kinematics of the two visible tau decay products, and take the minimum value as m τ τ . We can then set an upper limit on the cross section at 95 % CL. from the combined m τ τ distribution of all three channels.
There are two important differences between the A → τ τ decay from bbA production and that from the t → uA decay in tt production (in this section, we shall refer to them as bbA and tuA, respectively). One is that the p T,τ distribution is softer in the bbA sample when m A is small, whereas p T,τ in tuA sample is harder and practically independent of m A , especially in the tail. This is seen in the upper-left plot of Fig 5, where the normalized p T,τ distributions in the selected bbA and tuA samples are shown. The other is that the ∆R(τ 1 , τ 2 ) distribution is peaked at ∼ π in bbA samples whereas it is peaked at ∼ 0 for tuA, and more prominent when m A becomes smaller. The normalized ∆R τ τ distributions are shown in the upper-right plot of Fig. 5. As a result, the efficiencies of the selection cuts are rapidly falling as m A gets smaller in both cases of bbA and tuA but for different reasons.
The resulting acceptance and efficiency as a function of m A is shown in the lower-left plot of Fig. 5. As τ -tagging efficiency rapidly falls as p T,τ decreases, so is the lepton acceptance from the leptonic tau decay. The final efficiency for the bbA production diminishes when m A becomes small. On the other hand, the acceptance of the tuA channel is relatively high. For example, it is about 30 times higher than the bbA channel for m A = 30 GeV. Taking this efficiency difference into account, we can re-interpret and apply the constraints obtained for the bbA production to the tuA production. The resulting upper bound on the tuA production cross section at the 95 % CL is shown as a function of m A in the lower-right plot of Fig. 5. Assuming σ(tt) = 250 pb at LHC 8 TeV and BR(t → uA) is sufficiently small, we can translate the upper bound on the cross section to that on the branching ratio and obtain roughly BR(t → uA) < ∼ 0.3 % for m A ≥ 25 GeV. For m A < 25 GeV, the tuA acceptance is exponentially falling due to the ∆R > 0.5 cut. Based on our estimate, BR(t → uA) < ∼ 10 % would be allowed for m A = 15 GeV. Since the CMS analysis does not show the results for m A < 25 GeV, we apply in this range the same upper bound on the cross section times the acceptance given at m A = 25 GeV. We consider our results in that range as a conservative estimate because we expect smaller SM background contributions for the appropriate signal region.

C. Searches using di-tau tagging
In the previous section we have shown that the existing searches become insensitive for the light A region, of most interest to us for explaining the muon g − 2 in our model. In this section we propose an effective way to probe the region, and provide a rough estimate of the expected sensitivity using the current and future data at the LHC.
One of the main reasons for the sensitivity to present a sharp drop in the light m A region is due to the ∆R > 0.5 cut on the reconstructed leptons and taus. For a light A, it is boosted from the top decay with the decay products (a di-tau pair from A) being collimated.
They are difficult to discriminate and hence naturally captured as one object. We develop the di-tau tagging algorithm following the idea of mutual isolation proposed in the Ref. [47] as follows: • find a jet using the C/A algorithm with R = 0.5; • define two exclusive sub-jets j 1 and j 2 using calorimeter towers in the jet; • define tau-candidates j core 1 , j core 2 by drawing a cone of R core = 0.1 around each sub-jet; • for both i = 1, 2, require that once the activities (tracks and calorimeters) in j core i are removed, the remaining activity in the original jet satisfy the tau-tagging criteria, with f cut core = 0.9; and • the hardest tracks in j core denominator. First, the efficiency of both visible decay products from the taus in an A decay are captured in a jet is shown by the dotted curve as a function of m A , which rapidly drops as m A gets larger from 60% to 3%. The tagging efficiency of this algorithm using the number of jets capturing the two visible taus as the denominator is 7 ∼ 25 % depending on m A , as shown by the dashed curve. Finally the overall combined tagging efficiency for the signal ranges from 5% to 0.5% are shown by the red curve, while the mistagging efficiency for the non-tau jets is ∼ O(0.1) %. Another analysis quotes a similar di-tau tagging/mis-tag efficiency [48].
The main background after the appropriate preselection would be tt, as considered in this paper. A more dedicated analysis would require collaborations with experimental inputs.
The set of the preselection is as follows: • require that the event contains exactly one isolated lepton and at least three jets, with exactly one of them tagged as a b-jet and exactly one of them tagged with di-τ , j τ τ ; • m b < 150 GeV and m T ( , E / T ) < 100 GeV to guarantee that the event contains one standard top decay; and • 100 GeV < m jττ j 1 < 200 GeV to make sure j τ τ and j 1 are from the rare t → uA decay, where j 1 is the hardest non-b, non-di-τ jet. the corresponding H + → u RbR should be observed. Existence of these decays would offer a clear difference between the simple type-X 2HDM and the up-specific VAM. Note that when A is heavy, though loosing the motivation for explaning (g − 2) µ , the corresponding flavor-violating decay modes of A are also predicted. A modified model with the capacity to accommodate (g − 2) µ in the heavy A scenario will be discussed in the next section.
For m H m t and tan 1, we have Therefore, the flavor-violating decay H → tu dominates for ρ u > ∼ 1/120. Fig. 7 shows the branching ratios BR(H → f f ) as a function of ρ u for tan β = 40 and c β−α = 0. For example, BR(H → tu) reaches 90% for ρ u ∼ π/100. Since BR(H → τ τ ) is suppressed due to the new decay mode, the constraint from the searches for the heavy Higgs bosons via the τ τ mode could become much weaker in our scenario. It will be even more suppressed with the existence of the other decay modes involving the Higgs bosons, although we assume them negligible here.
Our model also predicts the helicities of top quark in the decay products from the heavy Higgses: the left-handed top or the right-handed anti-top should be observed. Confirming the existence of the H → tu decay and measuring the polarization of t in the decay products would be an important test for our model.

A. Muon-specific in lepton sector
So far, we have assigned the same non-zero PQ charge to charged leptons of all three generations, making their Yukawa couplings to A enhanced by tan β. In particular, the enhanced Aτ τ coupling is important to enhance the Barr-Zee diagram contributions to (g −2) µ while it is also constrained by the lepton universalities in heavy lepton and Z decays.
Thus, the mass of A is required to be lighter than 30 GeV as shown in Fig. 1, which forces us to fine-tune the hAA coupling to zero as h → AA decay is always kinematically allowed. For the lepton sector, however, we have more freedom in the PQ charge assignment as they are not relevant to the PQ-solution nor the domain wall problem. From the phenomenological point of view, we can assign a non-zero PQ charge only to µ and keep τ and e PQ-neutral.
We shall refer to the lepton sector in this scenario as the muon-specific lepton sector. It might be even more natural as we have a parallel setup to treat only one generation being special in both lepton and quark sectors. As a result, the Aτ τ and Hτ τ couplings are suppressed by cot β, and the constraints from the heavy lepton and the Z decays become irrelevant. In this case, the 1-loop contribution dominates over the 2-loop contribution of the cot β suppressed τ -loop Barr-Zee diagram for (g −2) µ [17]. Since among the 1-loop contributions only the one involving the CP-even H provides a positive contribution to (g − 2) µ , as explicitly seen in Table III, the preferred parameter region has to satisfy m H < m A . As all the contributions roughly scale like tan 2 β/m 2 φ , the required value of tan β/m φ to explain the deviation in (g − 2) µ is about 7 · 10 3 TeV −1 .
In the simple muon-specific model, the LHC data already set lower bounds on m H and m A as a plethora of 4µ events would be generated through pp → Z * → HA and both H and A can decay into a pair of muons with a branching ratio of almost 100% due to the tan β enhancement. The search with three or more muons had been performed by the CMS Collaboration using the 13-TeV data with 35.9 fb −1 and found the data consistent with the would be required to explain the (g − 2) µ deviation.
With the muon-specific lepton sector, only the up-specific VAM would be valid among the up-type VAM's as the tan β enhanced 2-loop contributions to (g − 2) µ are negative for the up-type quarks and only the up Yukawa coupling is small enough to neglect the effects.
As the mixing controlled by ρ u initiates the tan β enhanced 2-loop t-contribution, we cannot consider arbitrary large ρ u , as it is not theoretically favored by perturbativity with such a large tan β. We restrict the range of ρ u by requiring all Yukawa couplings be perturbative, and obtain ρ u < ∼ 20/ tan β ∼ 0.006 from Y ut = √ 2ξ ut m t /v < ∼ 4π. Within this range, the negative 2-loop top contribution is negligible.
As ρ u increases, the enhanced branching ratio of A/H → tu reduces and dominates over the decay of A/H → µµ for a good portion of the parameter space. Explicitly, the ratio of their branching ratios is and ranges from 0 to O(10 2 ). In this case, the above-mentioned 4µ constraint becomes weaker or invalid for a non-zero ρ u . It opens up an allowed region for lighter m A with the corresponding smaller tan β.

B. Assigning non-zero PQ charges to down-type RH quarks
Instead of assigning non-zero PQ charge to the RH up-type quarks, we can do the same to the RH down-type quarks without loosing the motivations. In this subsection, we briefly comment on how such a scenario is severely constrained by quark mixing in the down sector.
The mixing structure is analogous to the up-type specific VAM, and the mixing matrix V d would be the same in form as V u but with ρ u and ψ u replaced respectively by ρ d and ψ d to describe the d − b and d − s mixing. We also define H d in a way analogous to H u in Eq. (9) with the corresponding substitution. According to Table II of Ref. [27], the constraints from B 0 d , B 0 s and K 0 oscillations are: B 0 s : The region in the ρ d − ψ d plane allowed by the above constraints for m A = 15 GeV and tan β = 40 is shown by the white region in Fig. 9. Only the region nearby (ρ d , ψ d ) ∼ (0, 0) (down-specific VAM) is shown. The B 0 d constraint is important for ψ d = 0, giving ρ d , ρ d < ∼ 1.4 × (m A /10 4 GeV)/ tan β. The K 0 constraint is important for ρ d = 0, giving ψ d , ψ d < ∼ 9.8 × (m A /10 4 GeV)/ tan β. The B 0 s constraint is important for ψ d = π, giving ρ d , ρ d < ∼ 12.6 × (m A /10 4 GeV)/ tan β. We define ψ d = ψ d + π and ρ d = ρ d + π. There are also allowed regions with (ρ d , ψ d ) ∼ (0, π) (strange-specific VAM) and (π, any value) (bottomspecific VAM). However, as mentioned in the previous section the bottom-specific solution is strongly disfavored by the B s → µµ data.

VI. CONCLUSIONS
We have studied variant axion models (VAM's) with only a specific fermion charged under the Peccei-Quinn symmetry and their capacity to accommodate the muon g − 2 anomaly as well as the compatibility with various other experimental constraints. We start by considering the up-type specific VAM's and find that the combined χ 2 fit favors the parameters m A ∼ 15 GeV and tan β ∼ 40, the same as the type-X 2HDM. Moreover, we find that this parameter choice has no conflict with flavor observables as long as the mixing angle ρ u is sufficiently small. In particular, a small nonzero mixing angle ρ u ∼ π/100 is slightly favored by the observed B s → µµ branching ratio.
As the charm-mediated Barr-Zee diagram contribution to (g − 2) µ is negative, the charmspecific VAM is disfavored in comparison with the up-specific VAM. We therefore focus on the up-specific model and its promising signature of the rare t → uA decay followed by A → τ τ at the LHC. Current searches of A already impose some constraints in the parameter space, but do not exclude the most interesting region of m A ∼ 20 GeV. We propose an efficient search strategy that employs di-tau tagging using jet substructure information, and have explicitly demonstrated that it would enhance the sensitivity on BR(t → uA), especially in the light m A region of great interest to us. Our model also predicts that the heavy Higgs bosons have significant flavor-violating decays, such as A/H → tu. We encourage our experimental colleagues to search intensively for this flavor-changing top decay and the flavor-violating resonances.
We have also considered other variants: the muon-specific lepton sector and the downtype specific VAM's. The up-specific VAM with the muon-specific lepton sector is very interesting possibility as no tuning is required to suppress h → AA and the scenario is not constrained by the lepton universality measurements. Unlike the simplest muon-specific model, the up-specific VAM with the muon-specific sector predicts that the heavy Higgs bosons can decay into a pair of flavor-violating up-type quarks such as H/A → tu at a significant branching fraction. It suppresses the H/A → µµ decay, making the 4µ constraint at the LHC less effective and opens up more parameter space. The down/strange-specific VAM's with the muon-specific lepton sector would also be viable possibilities. The downtype specific VAM's are strongly constrained by the B s → µ + µ − decay and the B d,s and K meson mixing data, rendering a very fine-tuned parameter space. Nevertheless, such scenarios could offer another interesting possibility to explain (g − 2) µ as one of the bottom Barr-Zee diagram contribution is positive.