Muon Flavor Violation in Two Higgs Doublet Model with Extra Yukawa Couplings

The new round of experiments, MEG~II, COMET/Mu2e and Mu3e, would soon start to push the $\mu \to e\gamma$, $\mu N \to eN$ conversion and $\mu \to 3e$ frontier, while Belle~II would probe $\tau \to \mu\gamma$ and $\tau \to 3\mu$. In the general two Higgs doublet model with extra Yukawa couplings, we show that all these processes probe the lepton flavor violating (LFV) dipole transition that arises from the two loop mechanism, with scalar-induced contact terms subdominant. This is because existing data suggest the extra Yukawa couplings $\rho_{\mu e},\, \rho_{ee} \lesssim \lambda_e$, while $\rho_{\tau\mu},\, \rho_{\tau\tau} \lesssim \lambda_\tau$ and $\rho_{tt} \lesssim \lambda_t$, with $\lambda_i$ the usual Yukawa coupling of the Standard Model (SM), where $\rho_{\mu e}\rho_{tt}$ and $\rho_{\tau\mu}\rho_{tt}$ enter the $\mu e\gamma$ and $\tau\mu\gamma$ two loop amplitudes, respectively. With the $B_s \to \mu\mu$ decay rate basically consistent with SM expectation, together with the $B_s$ mixing constraint, we show that $B_s \to \tau\tau$ would also be consistent with SM, while $B_s \to \tau\mu$ and $B \to K\tau\mu$ decays would be out of reach of projected sensitivities, in strong contrast with some models motivated by the B anomalies.


I. INTRODUCTION
The study of muon properties is practically the oldest subject of particle physics, but remains at the forefront of current research. The MEG bound [1] on muon flavor violating (µFV) µ → eγ decay rate at 90% C.L. is B(µ → eγ) < 4.2 × 10 −13 , (MEG, 2016) (1) while a rather dated result of SINDRUM gives [2] B(µ → 3e) < 1.0 × 10 −12 , (SINDRUM, 1988) (2) for µ + → e + e − e + search. A third type of µFV search studies µ → e conversion on nuclei. Normalized to the muon capture rate, SINDRUM II finds [3] R µe < 7 × 10 −13 , (SINDRUM II, 2006) for µ → e conversion on gold. With schedules delayed by the current world pandemic, MEG II [4] will push the µ → eγ bound down to ∼ 6 × 10 −14 with three years of data taking. A new experiment to search for µ + → e + e − e + , Mu3e [5], plans to reach down to 5 × 10 −15 with three years of running, and is limited mostly by the muon beam intensity. Projected intensity improvements [6] by up to two orders of magnitude seem feasible, hence Mu3e can eventually reach down to 10 −16 in sensitivity. In contrast, to improve µ → eγ sensitivity beyond MEG II, innovations are needed for background suppression.
In terms of projected improvements, µ → e conversion i.e. µN → eN is perhaps the most promising. SIN-DRUM II operated at the limits of power consumption, so new developments [7] are based on the idea [8] of using special solenoids for pion capture, muon transport, as well as detection, which significantly improves muon intensity. Phase I of COMET [9] aims for R µe < 7 × 10 −15 , eventually reaching down to 10 −17 for Phase II. Similar to COMET Phase II in design, Mu2e [10] aims at 2.6×10 −17 sensitivity. Both experiments can be improved further. For example, ongoing [6] PRISM/PRIME [11] developments aim at bringing the limit eventually down to a staggering 10 −19 . Although the primary objective for µN → eN is contact interactions, it also probes [12] the dipole interaction, and can be in place to probe µ → eγ if the associated backgrounds of the latter cannot be brought under control at high muon intensity.
The current bounds and projected sensitivities on µFV processes are summarized in Table I. The impressive bounds for the muon reflect seven decades of studies. We also list the corresponding processes for τ , i.e. τ → µγ and τ → 3µ, where the current bounds are from B factories [13,14], and expectations [15] are for Belle II with 50 ab −1 in the coming decade. LHCb can [16] cross check the Belle II result on τ → 3µ after Upgrade II, i.e. at the High Luminosity LHC (HL-LHC). The heaviness of τ hence its later discovery, smaller production cross section plus difficulty in detection underlie the weaker search limits. However, its heavy mass and third generation nature offers a different window on New Physics, or equivalently beyond the Standard Model (BSM) physics.
We studied [17] τ → µγ decay previously in conjunction with h → τ µ, where h is the 125 GeV boson discovered in 2012 [18]. The context was the two Higgs doublet model (2HDM) with extra Yukawa couplings, which was called general 2HDM (g2HDM). The h boson picks up the extra ρ τ µ Yukawa coupling from the CP -even exotic Higgs boson H via h-H mixing. Given that this mixing angle, c γ , is known to be small (the alignment phenomenon [19], or that h so closely resembles the SM Higgs boson [18]), only a weak constraint is placed on ρ τ µ . Together with the extra top Yukawa coupling ρ tt , the ρ τ µ coupling induces τ → µγ decay via the two-loop mechanism [20]. Taking ρ tt ∼ λ t 1, the strength of the top Yukawa coupling of SM, it was shown that Belle II can probe the ρ τ µ λ τ 0.010 parameter space.
Taking ρ tt at O(λ t ) and ρ τ µ λ τ together, they correspond to [17] ρ f 3j λ f 3 , (j = 1) with ρ f 31 λ f 3 expected. As we will see, this relation does not hold for down-type quarks because of tight constraints from (K and) B meson physics. The probe of ρ tt by τ → µγ via the two-loop mechanism is quite significant, as ρ tt can drive [21] electroweak baryogenesis (EWBG), or the disappearance of antimatter in the very early Universe. A backup mechanism [21] is through |ρ tc | ∼ λ t (i.e. saturating Eq. (4)) in case ρ tt accidentally vanishes.
In this paper, we show that the MEG II search for µ → eγ would continue to probe which echoes |ρ ee | ∼ λ e ∼ = 0.0000029 that is suggested [22] by the recent ACME result [23] on electron electric dipole moment (eEDM), where a correlation of |ρ ee /ρ tt | ∝ λ e /λ t is implied. That is, the tiniest CP violation on Earth seems linked with the Baryon Asymmetry of the Universe (BAU)! The ρ µe , ρ ee behavior suggest which likely holds also for i = 3, and seems plausible for f = u, d. Thus, the affinity of the 1-2 sector of extra Yukawa couplings may be with the first generation, while the affinity of the 3-2 sector may be with the third generation, which echo the mass-mixing hierarchy. That the ρ d matrix is close to diagonal is a mystery. If the "septuagenarian" ("octogenarian" if counting from date of discovery) muon appear "sanitized", i.e. very much SM-like, as reflected in the weak strength of the extra Yukawa couplings mentioned, one cannot but think of the "B anomalies" that have been in vogue for almost the past decade. For a brief summary -and critique -of these B anomalies, see e.g. the "HEP Perspective and Outlook" given by one of us in Summer 2018 [24]; the situation about the B anomalies has not changed by much since then. Some of the suggested remedies of the B anomalies, especially the leptoquark (LQ) variant, relate to tree level effects hence make large impact in general. In contrast, though also at tree level, the extra Yukawa couplings have hidden themselves so well for decades, via the relations such as Eqs. (4) and (6), the near-diagonal ρ d matrix, plus alignment [19]. A second purpose of the present paper is therefore to contrast the predictions of g2HDM vs the "bold", UV-complete models such as PS 3 [25][26][27]. For this reason, we will extend the list of µFV processes beyond Table I to include various rare (semi-)leptonic B decays.
The paper is organized as follows. In the next Section, we discuss µ → eγ in g2HDM, which is pretty much parallel to what we have done for τ → µγ [17]. We show that the µ → eγ process probes the ρ µe ρ tt product in g2HDM, as well as c γ ρ µe where c γ is the h-H mixing angle. In Sec. III we cover the µ → 3e and µN → eN processes, as well as τ → 3µ. We show that the g2HDM effects are very suppressed at tree level, and that all these processes eventually pick up the µeγ or τ µγ dipole couplings.
II. THE µ → eγ PROCESS MEG II [4] has genuine discovery potential in g2HDM with extra Yukawa couplings.
Our emphasis is on phenomenological discussion, so we take Ref. [17] as template and do not recount details of the g2HDM here. The formulas used in Ref. [17], besides originating from Ref. [20], have also been checked against those of Ref. [30], although one should use caution with this reference, as it was written in a time when there was a hint for h → τ µ from CMS, which has subsequently disappeared [18]. What should be emphasized is that, in g2HDM, the exotic Higgs bosons H, A (CP -odd) and H + would naturally populate the 300-600 GeV range, but which we have surprisingly little knowledge of.
In g2HDM, flavor changing neutral Higgs (FCNH) couplings are controlled [29] by the mass-mixing hierarchy, hence the one loop diagram, Fig. 1(left), is expected to be highly suppressed [20] by multiple chirality flips. Using the one loop formula of Ref. [17]  of indices, we assume ρ µµ ρ µe from an intermediate muon in the loop is negligible compared with ρ * τ µ ρ τ e from an intermediate τ , which is even more so the case for an intermediate e. We illustrate this "one loop benchmark" in Fig. 2 for ρ τ µ = ρ µτ = λ τ and ρ τ e = λ e , and for m A = m H +200 GeV (or with H ↔ A interchanged). The effect by itself is out of reach for any time to come, unless A, H are very light. In fact, for m A = m H ∈ (300, 500) GeV, due to a cancellation mechanism, the MEG or the future MEG II bounds would allow ρ τ µ ρ τ e at O(10 4 ) times larger than λ e λ τ , which is very accommodating. For nondegenerate m H = 300 GeV, m A = 500 GeV, we find ρ τ µ ρ τ e /λ e λ τ 17 by MEG can be improved to 6  It is the two loop mechanism [20] that is of interest for g2HDM, where the ρ µe coupling induces µ → eγ decay by inserting the φ → γV * vertex (φ = h, H, A, see Fig. 1(center) and 1(right)) related to the h → γγ process, with V = Z subdominant. Following Ref. [17] for τ → µγ, we define two BSM benchmarks for illustrating two loop effects. Taking the extra top Yukawa coupling ρ tt 1 while setting c γ = 0, one maximizes the H, A effect but decouples the h boson. This "BSM benchmark" is illustrated in Fig. 2, where ρ µe = ρ eµ 0.3λ e is taken to satisfy the current MEG bound of Eq. (1) at m H or m A = 200 GeV. The MEG II experiment will continue to probe ρ µe down to lower values.
A second benchmark illustrates the effect of the SMlike h boson, where we take ρ tt = 0 to decouple the exotic H, A scalars, but take c γ = 0.2 as a large value that may still be allowed. This "h benchmark" is also plotted in Fig. 2, giving B(µ → eγ) 10 −14 for ρ µe = ρ eµ 0.3λ e , which appears out of reach for MEG II. Depending on whether c γ is smaller or larger than 0.2, the rate would drop further or become larger, although a c γ value larger than 0.2 may not be plausible. But the rate scales only with the product of c 2 γ ρ 2 µe , and if ρ tt truly vanishes, a ρ µe value larger than 0.3λ e is allowed.
We note that, unlike the τ → µγ case where h → τ µ [18] provides a constraint [17] on c γ ρ τ µ , no realistic constraint on c γ ρ µe can be extracted from h → µe search [18] for our purpose, as µ → eγ already constrains ρ µe to be so small. On the other hand, the value of ρ tt is not known at present, except that any finite value may suffice [21] for EWBG. For instance, in trying to account for the strong bound on electron EDM by ACME [23], the smaller |ρ tt | 0.1 was chosen in Ref. [22] to ease the tension. While ρ tt at O(1) is not strictly ruled out, we stress that µ → eγ probes the ρ µe ρ tt product, hence we do not really know whether we are probing ρ µe for the BSM benchmark below the strength of λ e yet. Thus, for example, if ρ tt = 0 and EWBG is through the ρ tc mechanism [21], then the MEG bound of Eq. (1) only requires ρ µe = ρ eµ 1.9λ e for our h benchmark, and MEG II could probe down to 0.7λ e . Both values are still in accord with Eq. (6), but we note that if c γ is lower than the value of 0.2 used, which seems likely, then the allowed ρ µe range would rise.
As a passing remark, we expect τ → eγ to be much suppressed compared with τ → µγ in g2HDM, as ρ τ e is expected to be much smaller than ρ τ µ .

III. OTHER µFV PROCESSES
A. µ → 3e and τ → µγ, 3µ As Mu3e would start soon to finally probe below the old SINDRUM bound of 10 −12 , Eq. (2), we estimate the µ → 3e rate. We find, consistent with Ref. [31], the simple tree level formula for µ → 3e, where we ignore extra Yukawa coupling corrections to the muon decay rate Γ µ [28], y φij are Yukawa couplings for φ = h, H, A that can be read off from Eq. (3) of Ref. [17], andm φ are scalar masses normalized to v.
The relevant effective Lagrangian is given by [34,35] where C L,R T correspond to the µeγ dipole, while C SL(R) qq are coefficients to contact terms generated by scalar exchange. There are no current-current interactions at tree level in g2HDM. One computes the conversion rate Γ µ→e and normalize to the muon capture rate to get R µe . The conversion rate is given by 10) where the L and R effects add in quadrature, and S p,n accounts for lepton-nucleus overlap. For gold, we use [36] D = 0.189, S p = 0.0614, and S n = 0.0918. In Eq. (10), relates to nucleon matrix elements, f p,n q , that account for the quark content of the proton, where we use f p u = f n d = 0.024, f p d = f n u = 0.033 [37], f p s = f n s = 0.043 [38]. For heavy quarks, we follow Ref. [37] and use f p,n [39] for Q = c, b, t. In g2HDM, the tree level contribution can be written in terms of Wilson coefficients [35] for the contact terms induced by scalar φ = h, H, A boson exchange, whereŷ φeµ (ŷ φqq ) is normalized to λ µ (λ q ), and one flips [17] for B(τ → µγ) formulas). The µeγ dipole again dominates µN → eN conversion, with contact terms subdominant. For our benchmark, we obtain the conversion ratio R µe | contact 2.4 × 10 −16 for gold as an example, while R µe | dipole 1.6 × 10 −15 . Here we have used ρ qq = λ q for all quarks, except ρ tt 0.4 as inferred from MEG bound with our benchmark. We note that contact terms are relatively important in µ → e conversion compared to µ → 3e process. These values can be probed at COMET and Mu2e. In fact, these experiments are posed to overtake MEG II in probing µ → eγ in g2HDM. Furthermore, if observed, together with the knowledge of nuclear matrix elements, one can use several different nuclei to probe and extract the effect of the contact term(s) in Eq. (9). We see that the extra ρ µe and ρ ee couplings of g2HDM hide very well so far from muon probes. It is with the help of extra ρ tt coupling via the two loop mechanism [20] for µ → eγ decay that MEG constrains ρ µe λ e (see Eq. (6)). MEG II would continue this program, but the µN → eN experiments, COMET and Mu2e, would become competitive when 10 −15 sensitivity is reached. Mu3e can confirm the dipole nature once µ → 3e is also observed with high muon intensity upgrades. Likewise, τ → µγ would probe ρ τ µ modulo ρ tt , but the τ → 3µ process seems out of reach for Belle II (hence LHCb) if g2HDM holds, even if Belle II quickly observes τ → µγ. Thus, while there remains hope for discovery, µFV physics look "sanitized" within g2HDM that possesses these extra ρ (and ρ tt ) Yukawa couplings, which bears witness to the long history of muon research.

IV. CONTRAST: MUON, OR BOLD
In this section, we contrast the "sanitized" muon front of the previous sections with what we dub the "bold" BSM front inspired by B anomalies. We refer to Ref. [24] for a discussion of all the current B anomalies, including cautionary notes on the experimental results. Extending from µFV, we discuss BSM effects in (semi-)leptonic B decays, be it BSM enhancement of B q → τ τ , or the purely BSM decays B q → τ µ, B → Kτ µ. We also touch upon the B q → µµ and B → µν, τ ν decays, which already appear to be SM-like in rate.
A. BSM-enhanced: Bq → τ τ Modes The "BaBar anomaly" in B → D ( * ) τ ν [18,24] suggests a large tree level BSM effect interfering with the SM b → cτ ν amplitude. Based on general arguments, it was pointed out [40] that such a large effect should be accompanied by similar effects in b → sτ τ . Note that, because of the difficult τ + τ − signature, the experimental bounds [18] are rather poor. Projecting from the BaBar anomaly, Ref. [40] suggested that B(B s → τ τ ) ∼ 5×10 −4 (or larger) is possible, to be compared with 7.7 × 10 −7 in SM [41]. Similarly, B(B → K ( * ) τ τ ) ∼ 10 −4 is projected. The theory suggestion was in part stimulated by the LHCb search [42], based on 3 fb −1 Run 1 data, setting the 90% C.L. bound of which is an order of magnitude higher than the theory suggestion. Likewise, the only limit on 3-body search, B(B + → K + τ + τ − ) < 2.3 × 10 −3 from BaBar [43], is also poor. One suffers from lack of mass reconstruction capability, and only at the HL-LHC after LHCb Upgrade II [16] can the sensitivity reach ∼ 5×10 −4 , touching the upper reaches of projected enhancement [40]. Belle II plans to take some Υ(5S) data early on, and projects the reach of ∼ 8.1 × 10 −4 [15]. As the environment is clean, Belle II would likely take more Υ(5S) data if the BaBar anomaly is confirmed. For B → K ( * ) τ τ , the Belle II sensitivity of ∼ 2 × 10 −5 [15] should be able to probe the range of interest at O(10 −4 ). We list the current limits and future prospects for the B q → τ τ and B → K ( * ) τ τ modes in Table II. B. Purely BSM: Bq → τ µ and B → Kτ µ Modes The B anomalies suggest lepton universality violation (LUV), such as B → D ( * ) τ ν vs B → D ( * ) µν, or B → K ( * ) µµ vs B → K ( * ) ee. It was suggested [44] on general grounds the possibility of accompanying lepton flavor violation (LFV), giving rise to interesting decays such as B q → and B → K for = . As the B anomalies persisted, serious model building went underway, and we take the so-called PS 3 model [25] as the standard bearer for ambitious UV-complete models (which we term "bold"). To handle severe low energy constraints and focus on the third generation, the Pati-Salam (PS) model [45] comes in three copies. The presence of leptoquarks (LQ) in the Pati-Salam model induce the decays such as B q → τ µ and B → Kτ µ, where detailed phenomenology was given in Ref. [26].
These are striking signatures! Before long, with 3 fb −1 Run 1 data, LHCb sets [46] the 90% C.L. limit of which contrasts with the poor performance of Eq. (13) for B s → τ τ . This limit practically ruled out the entire B(B s → τ µ) range projected by Ref. [26], forcing model builders to introduce [27] right-handed LQ interaction as tune parameters. In so doing, B s → τ τ and B → Kτ τ decays get enhanced [27], which is in accordance with Ref. [40]. It would be interesting to see the full 9 fb −1 Run 1 + 2 result for B s → τ µ, τ τ modes. Perhaps because the analysis of Ref. [46] was still underway when the LHCb Upgrade II document [16] was being prepared, we cannot find the sensitivity projections of B s → τ µ for full LHCb Upgrade II data (and neither for Belle II), hence we state this explicitly in Table II. BaBar has searched [47] for the companion B → Kτ µ mode. Using full hadronic tag to reconstruct the other charged B hence with full kinematic control, by measuring K + and µ − , one projects into the m τ window without reconstructing the τ . The result at 90% C.L. is [47] B(B + → K + τ + µ − ) < 2.8 × 10 −5 , (BaBar, 2012) (15) < 3.9 × 10 −5 , (LHCb, 2020) (16) for the better measured charge combination, and Eq. (16) is the recent LHCb measurement [48] with full 9 fb −1 Run 1 + 2 data. We first note that Belle has not performed this measurement so far, despite having more data than BaBar. The second point to stress is that, although the LHCb result may not appear competitive at first sight, they exploit B * 0 s2 → B + K − decay and use the K − to tag [49] the B + for full kinematic control, putting LHCb in the game for the B + → K + τ + µ − pursuit, and making things more interesting for the Belle II era.
LHCb also places the best bounds [50] for B(B s → µe) < 5.4×10 −9 and B(B d → µe) < 1.0×10 −9 , as well as B(B + → K + µ + e − ) < 6.4 × 10 −9 [51]. The current limits and future prospects for the B q → τ µ and B → K ( * ) τ µ modes are listed in Table II. The µe counterparts are also listed, but aside from the comment given in Ref. [44], it is not easy from the model building point of view to make projections that are experimentally accessible.
C. SM-like: Bq → µµ and B → τ ν, µν Modes It is useful to recall that B s → µµ was a front runner [18] in the 2000's as possibly greatly enhanced, but a few years into LHC running, the B s,d → µµ decays became consistent with SM: the PDG values [18]  −0.35 )×10 −9 and B(B 0 → µµ) < 1.6 × 10 −10 at 90% C.L. A discrepancy for B s → µµ at ∼ 2σ is suggested, which was already indicative with PDG average, while the low value for B d → µµ is in part due to the negative central value from ATLAS. We will use the PDG result (see Table II), which should be good enough for our illustrative purpose. In any case, the B d mode is not yet observed, but should emerge with sufficient data. The estimated errors for LHCb at 300 fb −1 [54] is given in Table II. Naturally, models such as PS 3 do not give large enhancement for B q → µµ, but B s → µµ serves as a reminder of how things might evolve for the B anomalies, in as much as these "anomalies" are data-driven.
The B → τν rate receives a neat correction [55] in type two 2HDM (2HDM-II), while Belle measurements [18] have settled around SM expectation, and in fact provides a constraint [27] on PS 3 . Since the correction factor of Ref. [55] does not depend on the flavor of the charged lepton, one has the ratio R µ/τ B = B(B → µν)/B(B → τν) ∼ = 0.0045 for both SM and 2HDM-II [56]. But some subtleties such as V tb /V ub enhancement and non-detection of neutrino flavorν i (it could beν τ that escapes), as discussed in Ref. [28], allow R µ/τ B to deviate from the expected value precisely in g2HDM, and one probes the ρ τ µ ρ tu product. Note that our actual knowledge [57] of ρ tu is rather poor compared with what is suggested in Eq. (4). The recent Belle update [58] gives B(B → µν) = (5.3 ± 2.2) × 10 −7 , (Belle, 2020) (17) where we add the statistical and systematic errors in quadrature, treating as Gaussian. Eq. (17) is consistent with SM, but gives a two-sided bound, i.e. B(B → µν) could be above or below the nominal SM value [28] of 3.9 × 10 −7 , and the R µ/τ B ratio provides a good probe of g2HDM for Belle II in the next few years.
We reiterate that, though B q → µµ are loop processes while B → τ ν, µν are at tree level, and the measured values still have to settle, none are in disagreement with SM expectation, which put constraints on BSM models inspired by B anomalies, as well as g2HDM. The current status and future prospects are listed in Table II.

D. Contrasting g2HDM with "Boldness"
Having presented the status of various (semi-)leptonic rare B decays, where some striking projections arise from models motivated by B anomalies, we turn to contrasting with g2HDM, the projections of which conform better with the more "sanitized'" tradition of muon physics.

From µFV to PS 3
The purely leptonic µFV processes discussed previously, such as µ → eγ in Sec. II, and µ → 3e, τ → µγ, τ → 3µ and µN → eN in Sec. III, are illustrated in Fig. 3. That is, the current bounds and future sensitivities listed in Table I are plotted as blue solid and orange dotted circles, respectively. None are so far observed, so the current MEG bound on µ → eγ is also marked by a downward red ⇓ for the g2HDM projection, where, for sake of illustration, we have set up a benchmark consistent with Eqs. (4) and (6) and with small h-H mixing. As the scalar-induced contact effect is rather small, the dipole µ → 3e transition is also marked by a downward red ⇓. However, though subdominant, the scalar-induced contact effect for µN → eN is not negligi-ble, and the downward red ⇓ shows the combined dipole plus contact effect, which is destructive. The sign of interference, however, could be easily flipped, so the actual possibilities are considerably broader. The τ → µγ rate with this benchmark is also illustrated, which falls toward the lower range of Belle II reach, while we predict that τ → 3µ is out of reach in g2HDM.
Likewise, the current bounds and future sensitivities for (semi-)leptonic rare B decays discussed in Sec. III.A and III.B are also plotted in Fig. 3. Of interest here is some two-sided projections, as they stand at present, for the striking signatures arising from PS 3 [27]: while B(B → Kτ µ) scales down from B(B s → τ µ) by a factor of ∼ 9, and for B(B → Kτ τ ) vs B(B s → τ τ ) the factor is ∼ 13. We do not show the B(τ → µφ) mode [27] as it seems out of Belle II reach. These ranges are shown in Fig. 3 as grey shaded bands, where existing bounds for B s → τ µ and τ → µγ cut into the upper ranges of PS 3 projections, and are the points of our comparison with g2HDM expectations. As noted, the future sensitivity for B s → τ µ is not quite known at present.
We note further that, with τ → µγ generated by LQ in the loop, there is an anti-correlation with B(B s → τ µ) within the PS 3 scenario [27]: if the limit on B s → τ µ is pushed further down with 9 fb −1 full Run 1 + 2 data, then B(τ → µγ) will move up and become closer to the current limit, and would be a boon to Belle II in the model scenario. Likewise, pushing down on τ → µγ would imply an increased lower bound for B s → τ µ, τ τ in PS 3 . These bounds and (anti-)correlations allow the PS 3 model to "provide a smoking-gun signature for this framework . . . or could lead us to rule it out" [27].
The B q → µµ and B → µν, τ ν processes discussed in Sec. III.C are plotted differently in Fig. 3, as they are now mostly found to be consistent with SM expectations (marked as red ). The measured B s → µµ rate, shown as the narrow green shaded band, covers the SM expectation but appears slightly on the low side. Likewise, B → τ ν is also measured to be consistent with SM, which Belle II would continue to probe. For B d → µµ, we plot the more conservative upper limit from PDG, while the latest Belle update on B → µν gives a two-sided bound, which is illustrated by the broad green shaded band that covers the SM expectation. The PS 3 model shies away from processes that involve only muons, but B → τ ν does provide [27] some constraint.

The bq
Processes in g2HDM The rare B decay processes of interest (we only quote results for B → ν) are in the form of bq 4-fermi interactions. Thus, the extra Yukawa couplings that enter on the quark side are ρ bs , ρ bd at tree level, and ρ for ( ) = τ, µ, e on the charged lepton side. For the latter, we continue to use our benchmark values ρ τ τ , ρ τ µ = λ τ 0.010 (Eq. (4)), and ρ µe , ρ ee = λ e ∼ = 0.0000029 (Eq. (6)). The issue is that, for = , SM loop effects seem affirmed by experiment, while for = , there is no SM loop effect, and one would need the leptonic FCNH couplings in g2HDM to act. In the following, we will use tree level approach to B q → µµ to infer B q → for = case, while using loop corrections for B q → µµ to discuss B q → τ τ . In each case, the corresponding B q mixing constraints are taken into account.
It is well known that the measured [18] B q mixings can be accounted for quite well by SM loop effects. For example, the operator , with x t = m 2 t /m 2 W and S 0 (x t ) 2.35 from SM box diagram, and one just replaces s → d for B d mixing. In g2HDM, ρ bq (q = s, d) enters B q mixing at tree level, hence stringent constraints are implied.
The NP effects in B q mixings can be paramterized by defining C Bq e 2iΦ Bq = B q |H Full eff |B q / B q |H SM eff |B q . Using the 2018 NP fit performed by UTfit [59], one finds C Bs = 1.110 ± 0.090, Φ Bs = (0.42 ± 0.89) • , For sake of illustration and to reduce the number of parameters, we will treat extra Yukawas as real and assume that adding the g2HDM effect, C Bq and Φ Bq stay within 2σ ranges of Eq. (21). In g2HDM, the leading effect comes from the operator O 4 = (s α Lb α )(s β Rb β ) at tree level, which constrains the product ρ sb ρ * bs , while the operators O 2 = (s α Lb α )(s β Lb β ) and O 2 = (s α Rb α )(s β Rb β ) constrain individual couplings ρ * bs , ρ sb but are less constraining. Furthermore, the coefficients of O ( ) 2 suffer cancellation between H and A contributions. Assuming O 4 dominance, one has the coefficient C 4 = −y * φbs y φsb /m 2 φ where φ is summed over h, H, A, and we take c γ = 0.05 and m H = m A = 300 GeV as before. Taking renormalization group evolution into account [61], using bag factors from Ref. [62] and decay constants from Ref. [63], we find |ρ sb ρ * bs | (0.021 λ b ) 2 . In similar vein, we obtain |ρ db ρ * bd | (0.0046 λ b ) 2 , where we take λ b 0.016. Assuming reality, we adopt ρ sb ρ * bs 0.021λ b ∼ 0.00034, and ρ db ρ * bd 0.0046λ b ∼ 0.000074, respectively. With ρ bs , ρ bd and ρ so small, one may expect B q → modes would be SM-like in g2HDM, which is the case for B s → µµ, and to some extent B d → µµ as well: the measured strengths are indeed SM-like. At tree level, we find that B s → µµ gives stringent constraints on ρ bs(sb) , and can be on a par with those from B s mixing constraints. For example, for our benchmark of c γ = 0.05, ρ µµ = λ µ ∼ 0.00061, and m H = m A = 300 GeV, the 2σ range of B(B s → µµ) gives the bound of ρ bs = ρ sb 0.019λ b ∼ 0.00030, which is slightly more stringent than B s mixing. On the other hand, due to poorer measurement of B d → µµ so far, bounds on ρ db(bd) from B d → µµ are weaker than B d mixing. Thus, by the fact that B q → µµ rates are already SM-like in g2HDM, we expect B q → τ τ to be not so different from SM expectations if tree contributions prevail.
With ρ sb = ρ bs and ρ db = ρ bd so suppressed, one has to take up-type extra Yukawa couplings into account, which contribute to B q mixings and B q → at one loop order. The leading contributions to B q mixings come from the same box diagrams as SM, but with either one W + or both replaced by H + , which also generates O 1 . Considering the effect of ρ tt only, we obtain ∆C W H H for the HH box correction. Here, H stands as shorthand for H + , and the loop functions f and g are given in the Appendix.
Considering this one loop contribution by itself gives a constraint on the ρ tt -m H + plane. For example, for a 300 GeV charged Higgs, we find |ρ tt | 0.8, and similar bound from B d mixing as well. However, we caution that inclusion of additional up-type Yukawa couplings can induce cancellation effects, thereby weakening the constraint. Most notably, with ρ ct as small as O(10 −2 ), one can relax ρ tt to ∼ 1. As stated, we avoid cancellations and discuss tree and loop contributions separately. The same treatment is applied to rare B decays, and we continue to assume ρ qb = ρ bq and take them as real.
B q → µµ can also receive significant contribution through one loop diagrams, where the leading effect is from Z penguins with H + and top in the loop. This is a lepton flavor universal contribution and modifies the coefficient of O 10 = (sγ α Lb)(¯ γ α γ 5 ). We find [64] the ρ tt correction ∆C H + 10 = |ρ tt | 2 h(yx t )/16πα e , where the loop function h is given in the Appendix. The other loop diagrams are suppressed in the small ρ d ij approximation and/or by extra lepton ρ Yukawa couplings (such as in box diagrams). Similar to B q mixing, ∆C H + 10 puts a constraint on the ρ tt -m + H plane. For m H = m A = 300 GeV, we obtain ρ tt 0.4 for 2σ range of B(B s → µµ), which is more stringent than B s mixing. However, as already noted, the bound weakens if one includes other extra Yukawa couplings such as ρ ct , which receives |V cs /V ts | enhancement. In our numerical analysis, we therefore keep the tree level and one loop discussions separate, and only comment on cancellation effects later. Since LFV decays such as B s → for = arise at tree level in g2HDM, we give tree level upper reaches with ρ sb and ρ bs satisfying 2σ range of B s mixing and B s → µµ.
The effective Hamiltonian for flavor violating B s → τ µ and B → Kτ µ decays is of the form [65], where and O S,P are obtained by exchanging L ↔ R. Although C S and C P vanish for = in SM, tree level exchange of scalar bosons in the g2HDM lead to with φ summed over h, H and A, and C S,P is obtained from C S,P by changing y * φbs → y φsb . For B s → decay, we use [65] Bs is the decay width of the heavy B s state, m ± = m ± m , and ∆C i = C i −C i . With our benchmark of c γ = 0.05, m H = m A = 300 GeV and leptonic couplings, and the allowed range of ρ sb,bs extracted from flavor conserving B q → µµ (and in conjunction with bounds from B s mixing), the projections of various LFV B decays in g2HDM are given in Fig. 3 as red ⇓. Analogously, for B → K , we use [65] dB(B → K )/dq 2 = N 2 where ϕ S is a function of B → K form factors and N K a normalization factor. Both are q 2 dependent, and explicit expressions can be found in Ref. [65].

Comparing g2HDM with PS 3
Let us now make the comparison of the spectacular PS 3 projections with the modesty of g2HDM.
We have taken a simplified approach of treating B s → µµ and B s mixing either at tree level, or at one loop level, but not both simultaneously. Either way, the fact that B s → µµ is already consistent with SM expectation implies B s → τ τ in g2HDM should also be SM-like, which is more so if loop is dominant. This is in contrast with the sizable enhancement projected in PS 3 (grey shaded band in Fig. 3), which can be probed by LHCb Upgrade II, or dedicated runs by Belle II on Υ(5S). For g2HDM, some enhancement (or suppression) of B s → τ τ is possible, given that tree effect is controlled by ρ τ τ which is at O(λ τ ), while tree effect for B s → µµ is controlled by ρ µµ which is at O(λ µ ). But these order of magnitude estimates suggest that bridging the two orders of magnitude gap is unlikely, and g2HDM should be distinguishable from PS 3 . In any case, measurement of B s → τ τ is a challenge, while prospects for B d → τ τ at Belle II remains to be seen.
More promising for PS 3 -type of models would be B s → τ µ, which can saturate the current bound, and the discovery, perhaps even with Run 1 + 2 data of LHCb, would be truly spectacular. Projections for g2HDM, however, appears quite out of reach, as it is three orders of magnitude below the lower reach of the PS 3 projection. But our previous caution applies, that an order of magnitude enhancement is not impossible, though it would still be far out of reach. In addition, if one allows cancellation between tree and loop effects in both B s → µµ and B s mixing, it is not impossible that ρ bs(sb) can be larger than our suggested values, resulting in possible further enhancement of B s → τ µ. The challenge is with experiment. As we noted in Table II, the projected sensitivities, be it for LHCb or Belle, are not known publicly.
At this point, we remind the reader of the "seesaw" between B s → τ µ and τ → µγ within PS 3 [27]. Depending on analysis prowess and/or data accumulation speed, either measurement could be improved substantially in the next couple of years. If one limit is pushed down, then the prospect for the other would rise in PS 3 . In contrast, for g2HDM, while there is discovery potential for τ → µγ, one does not expect B s → τ µ to be observed any time soon. The situation for the B → Kτ µ mode is similar, where the projected sensitivity is again not yet clear, and we have given the number for Belle II in Table II, which barely starts to touch the PS 3 range. The situation for B d → τ µ in g2HDM would correlate with the outcome of B d → µµ measurement, while the PS 3 model does not provide predictions. Neither models foresee B q → µe and B → Kµe modes to be observable. Our projections for g2HDM are given in Fig. 3.
As we have also listed in Fig 3, B → µν provides a unique probe [28] of g2HDM, while B → τν again appears SM-like already. These are charged B decays, in contrast to neutral B decays for B q → . As a re-minder for purely leptonic µFV processes, the µ → eγ, µN → eN and τ → µγ processes have discovery potential, all basically probing the µeγ and τ µγ dipoles in g2HDM, though the µN → eN process can pick up contact effects. In contrast, µ → 3e and τ → 3µ would be higher order effects of the respective dipole transitions. We mention in passing that muon g − 2 would not be affected in g2HDM, while muon EDM, d µ , would likely scale by m µ /m e ∼ 200, and |d µ | 2 × 10 −27 e cm seems, unlike electron EDM d e , far out of experimental reach.

V. DISCUSSION AND CONCLUSION
There are good reasons to take g2HDM, the general two Higgs doublet model with extra Yukawa couplings, very seriously. By discovering the h boson and finding that it closely resembles the SM Higgs boson, we now have one weak scalar doublet. Whether by Gell-Mann's Totalitarian Principle [66] or the Principle of Plentitude [67], with the existence of one scalar doublet, there should be a second doublet, and by the same argument, extra Yukawa couplings. To declare [68] Natural Flavor Conservation (NFC) and forbid extra Yukawa couplings, or using a Z 2 symmetry to implement it, are not only not natural, but quite ad hoc or artificial. Had supersymmetry (SUSY) emerged at the LHC, it would have given credence to 2HDM-II, a type of 2HDM with Z 2 symmetry to forbid extra Yukawa couplings. But the lack of evidence for SUSY so far [18] suggests that the SUSY scale is considerably above v, the electroweak symmetry breaking scale.
With three types of charged fermions, each coming in three generations, and that the extra Yukawa couplings are naturally complex, one has 54 new Yukawa couplings, which may appear excessive. There are also 7 new Higgs parameters, which include the h-H mixing parameter c γ , and the exotic Higgs masses m H , m A and m H + . But the increment of 54 new flavor parameters is on top of the existing plentitude of 13 within SM, while the structure built-in by Nature seems to have helped "obscure" the presence of the extra Higgs sector parameters: as we have stated, m H , m A and m H + in g2HDM naturally populate the 300-600 GeV range. The latter follows if one takes [19] the principle that all dimensionless parameters in the Higgs potential are O(1) in strength, with v as the only scale parameter. It is curious to note that, with ρ tt naturally O(1) because it is a cousin to λ t ∼ = 1, it may help keep c γ small [69]. So the alignment phenomenon may be emergent, while ρ tt could drive EWBG quite effectively. At any rate, and as we have emphasized, the flavor parameter structure seems to have hidden itself rather well from our view, obscuring also the extra Higgs bosons, which we know so little about.
The flavor structure was first revealed in the 1970s through the fermion mass hierarchy, although the existence of three generations triggered Ref. [68]. But then the mixing hierarchy of |V ub | 2 |V cb | 2 |V us | 2 came as a surprise in the early 1980s, which led to the Cheng-Sher ansatz [70], suggesting that NFC may be too strong an assumption. Unknown back then was Nature's further design of alignment, which suppressed FCNH coupling effects of the light, SM-like h boson. As we stressed in the Introduction, at this point one may find fault in the near diagonal nature of the ρ d Yukawa matrix: Why would Nature turn off the FCNH effects precisely in the sector that we have the best access to? It is a mystery. But Nature has her mysterious ways, and as an experimental science we can only probe further.
In summary, the extra Yukawa couplings of g2HDM has the built-in mass-mixing hierarchy protection as exemplified by Eqs. (4) and (6), plus near diagonal ρ d Yukawa matrix and alignment. The µ → eγ and τ → µγ processes probe ρ µe ρ tt and ρ τ µ ρ tt via the two loop mechanism, and generate µ → 3e and τ → 3µ at higher order. The µN → eN process probes the combined effect of dipole plus contact terms, and by nature of the process and experimental prowess, one might disentangle the two effects. As a second theme, we do not expect LUV or LFV effects to be observed soon in (semi-)leptonic rare B decays for g2HDM. This is in contrast with the UVcomplete PS 3 model that is the epitome of the recent B anomalies, where the modes to watch are B s → τ µ, B → Kτ µ, and to a lesser extent, B s → τ τ , B → Kτ τ ; discovering only τ → µγ does not distinguish between the two scenarios. For g2HDM, besides the aforementioned µFV processes, B → µν may be the mode to watch, which probes ρ τ µ ρ tu .