Hadronic light-by-light contribution to the muon $g-2$ from holographic QCD with massive pions

We extend our previous calculations of the hadronic light-by-light scattering contribution to the muon anomalous magnetic moment in holographic QCD to models with finite quark masses and a tower of massive pions. Analysing the role of the latter in the Wess-Zumino-Witten action, we show that the Melnikov-Vainshtein short-distance constraint is satisfied solely by summing the contributions of the infinite tower of axial vector mesons. The asymptotic behavior of the pseudoscalar contributions is also enhanced when their infinite tower of excitations is summed, but this yields only subleading contributions to the short-distance constraints on light-by-light scattering. We also refine our numerical evaluations, particularly in the pion and $a_1$ sector; these corroborate our previous finding that the contributions from axial vector mesons are significantly larger than those adopted for the effects of axials and short-distance constraints in the recent White Paper on the Standard Model prediction for $(g-2)_\mu$.

In this paper we employ the simplest hard-wall AdS/QCD model that can simultaneously satisfy the leading-order (LO) perturbative QCD (pQCD) constraints on the vector correlator and reproduce the low-energy parameters given by the pion decay constant $f_\pi$ and the mass of the $\rho$ meson. This model was introduced in [38,39] and called HW1 in [34,40,41]; here we use its generic form with finite quark masses (albeit in the simplest version of uniform quark masses). This allows us to study the role of massive pions in the axial anomaly and in the longitudinal short-distance constraints (SDCs). We find that in holographic QCD (hQCD) summing the infinite tower of pseudoscalars does enhance the asymptotic behavior compared to that of the individual contributions, but within the allowed range of hQCD parameters this is insufficient to let pseudoscalars contribute to the leading terms of the longitudinal SDCs; the latter are determined by the infinite tower of axial vector mesons alone, in agreement with the expectation expressed in [35].
Moreover, we evaluate and assess a set of massive hard-wall AdS/QCD models with regard to their predictions for the pseudoscalar and axial-vector transition form factors (TFFs) and the resulting contributions to $a_\mu$, also allowing for adjustments at high energies to account for next-to-leading order (NLO) pQCD corrections. The results for the single- and double-virtual pion TFFs agree perfectly with the data-driven dispersive results of Ref. [21], in particular when a 10% reduction of the asymptotic limit is applied. The shape of the axial vector TFF is consistent with L3 results for the $f_1(1285)$ axial vector meson [42]; the magnitude is above experimental values, but becomes compatible after such reductions. While hQCD is clearly only a toy model for real QCD, this success after a fit of a minimal set of low-energy parameters seems to make it interesting also as a phenomenological model (which may be further improved by using a modified background geometry and other refinements [43,44,45]).
Numerically, the axial vector contributions are close to our previous results for the HW1 model in the chiral limit (they even tend to increase with finite quark masses), and they are thus significantly larger than estimated in the White Paper (WP) [4]. Interestingly, a new complete lattice calculation which claims comparable errors [28] has obtained somewhat larger values of $a_\mu^{\mathrm{HLBL}}$ than the WP [4], which could also indicate larger contributions from the axial vector meson sector.
The contributions from the excited pseudoscalars are numerically much smaller than those from axial vector mesons. They are present also in the chiral HW1 model, where they decouple from the axial vector current and the axial anomaly, but not from low-energy photons. However, the first excited pion mode has a two-photon coupling that is significantly above the experimental constraint deduced in [25]. Nevertheless, the contribution of the first few excited modes together is smaller than that of the Regge model of [24,25], which is compatible with known experimental constraints. This model was, however, not meant primarily as a phenomenological model for estimating the contributions of excited pseudoscalars but rather as a model for estimating the effects of the longitudinal SDCs, which according to the hQCD models are instead provided by the axial vector mesons. As far as these effects are concerned, our present results are not far above the estimates obtained in [24,25,46,47]; all are significantly below the estimates obtained with the so-called MV model [8], where the ground-state pion contribution is modified.
This paper is organized as follows. In Sec. II we review hQCD with an anti-de Sitter (AdS) background that is cut off by a hard wall and where, in addition to flavor gauge fields, a bifundamental scalar bulk field encodes the chiral condensates with or without finite quark masses. In addition to the boundary conditions considered originally in Refs. [38,39], where vector and axial vector gauge fields are treated equally (HW1), we also consider the other set of admissible boundary conditions that appear in models without a bifundamental scalar, the Hirn-Sanz (HW2) model [48] and the top-down hQCD model of Sakai and Sugimoto [49,50]. (This modification of the HW1 model will be referred to as HW3.) As has been pointed out in [51], this has the advantage of removing infrared boundary terms in the Chern-Simons action that in the HW1 model need to be subtracted by hand [52]. Additionally we consider the generalization to different scaling dimensions for quark masses and chiral condensates proposed in [51], which makes it possible to fit the mass of the first excited pion or the lowest axial vector meson. In Sec. III, we analyse the consequences of the axial anomaly in the HW models, in particular the role of excited pseudoscalars, as encoded by the bulk Chern-Simons action for the five-dimensional gauge fields. In Sec. IV we study SDCs on TFFs and the HLBL scattering amplitude, showing analytically that the MV-SDC is always saturated by axial vector mesons, while the symmetric longitudinal SDC is fulfilled to 81%. Within the bounds of allowed scaling dimensions for the bulk bifundamental scalar, the infinite tower of excited pseudoscalars cannot change this. Finally, in Sec. V we evaluate the set of massive HW models numerically, comparing the resulting masses, decay constants, and TFFs with empirical data, before calculating the corresponding contributions to $a_\mu$.

II. HARD-WALL ADS/QCD MODELS WITH FINITE QUARK MASSES
In the hard-wall AdS/QCD models of Refs. [38,39,48], the background geometry is chosen as pure anti-de Sitter with metric with a conformal boundary at $z = 0$. Confinement is implemented by a cutoff at some finite value $z_0$ of the radial coordinate, where boundary conditions for the five-dimensional fields are imposed. Left and right chiral quark currents are dual to five-dimensional flavor gauge fields, whose action is given by a Yang-Mills part where $P, Q, R, S = 0, \dots, 3, z$ and , and a Chern-Simons action $S_{\mathrm{CS}} = S_{\mathrm{CS}}^L - S_{\mathrm{CS}}^R$ with (in differential form notation) up to a potential subtraction of boundary terms at $z_0$ to be discussed below. As dual field for the scalar quark bilinear operator, a bifundamental bulk scalar $X$ [38,39], parametrized as [53] is introduced, with action where The five-dimensional mass term is determined by the scaling dimension of the chiral-symmetry breaking order parameter $\bar{q}_L q_R$ of the boundary theory. With $M_X^2 = -3$, one obtains vacuum solutions $v(z) = M_q z + \Sigma z^3$, where $M_q$ and $\Sigma$ are model parameters related to the quark mass matrix and the quark condensate. Following Ref. [51] we shall also consider generalizations away from $M_X^2 = -3$.
In the original hard-wall model of Refs. [38,39] (HW1), chiral symmetry preserving boundary conditions are employed at the infrared cutoff, $F^{L,R}_{z\mu}(z_0) = 0$, whereas in the model of Hirn and Sanz [48] (called HW2 in [34,40,41]) chiral symmetry is broken through $(B^L_\mu - B^R_\mu)(z_0) = 0$ and $(F^L_{z\mu} + F^R_{z\mu})(z_0) = 0$ without the introduction of a bifundamental scalar. These latter conditions arise naturally in the top-down holographic chiral model of Sakai and Sugimoto [49,50] with flavor gauge fields residing on flavor branes separated by an extra dimension at the boundary but connecting in the bulk.
In [51], it was proposed to use such symmetry breaking boundary conditions also in the presence of symmetry breaking by a bifundamental scalar, because it avoids infrared boundary contributions from the CS action. This variant of the HW1 model will be termed HW3 in the following.
In both the HW1 and the HW3 models, chiral symmetry can be broken additionally by finite quark masses. In the present paper, we shall only consider the flavor-symmetric case, where both $M_q$ and $\Sigma$ are proportional to the unit matrix. For both models we will also consider generalizations away from $M_X^2 = -3$, which, as Ref. [51] has found, allow the HW3 model to fit simultaneously the masses of the lightest and the first excited pion. As shown in Appendix A, all these models lead to a Gell-Mann-Oakes-Renner (GOR) relation in the limit of small quark masses, to wit, where $\alpha = 1$ for the standard choice $M_X^2 = -3$, and $0 < \alpha < 2$ for the admissible generalizations [51] $-4 < M_X^2 < 0$.

A. Vector sector
As long as flavor symmetry is in place, vector mesons, which appear in the mode expansion of $V_\mu = (B^L_\mu + B^R_\mu)/2$, have (for all hard-wall models considered here) quark-mass independent holographic wave functions given by so that canonically normalized pion fields $\Pi_n(x)$ appear in a mode expansion according to In unitary gauge one has Normalizable and non-normalizable $y$-modes are related by the sum rule [53] y S (q, z) = n m 2 n y n ( )y n (z) where $m_n$ are the masses of the tower of pions and f πn = −y n ( )/g 5 their decay constants, For later use we also define the Green function with boundary conditions as imposed on the mode functions $y_n$, so that In the chiral limit, the infinite tower of massive pions continues to be present, but their decay constants vanish: $f_{\pi_n} \to 0$ for $n \ge 2$ as $M_q \to 0$. The mode $n = 1$ becomes massless with $y_S \to g_5 f_\pi y_1$, $f_\pi = f_{\pi_1}$. (See Appendix A for more details.)

III. AXIAL ANOMALY AND MASSIVE PIONS
The Chern-Simons action (3) implements the axial anomaly and the associated $VVA$ coupling [52,57]. After integration over the holographic direction, one obtains a Wess-Zumino-Witten action for the mesons. This contains vertices involving photons, which are described by $V_\mu(x,z) = A_\mu^{\mathrm{e.m.}}(x)\, Q\, \mathcal{J}(z)$, with $Q$ denoting the charge matrix of quarks, and the fields encoded by $A_M$, namely the infinite towers of pions and axial vector mesons.
In the chiral limit, where all massive pions decouple from the axial vector current through vanishing decay constants, the resulting pion TFF has been analysed in various holographic QCD models in [58,59] and also in [40,41], where the HLBL contribution to the anomalous magnetic moment of the muon was studied. This was also done in [60] for the ground-state pseudo-Goldstone bosons in a version of the HW1 model with finite quark masses.
The contribution of the infinite tower of axial vector mesons, which is crucial for satisfying the Melnikov-Vainshtein short-distance constraint, has been worked out in [34,35] for the chiral version of the HW1 model and the inherently chiral Hirn-Sanz (HW2) model. We refer to these latter references for details on the axial vector contributions, which as we shall see do not change qualitatively away from the chiral limit, concentrating here on the generalizations necessary to include the contributions of massive pions in the HW1 and HW3 models.
In the holographic gauge $A_z = 0$, the tower of pions contributes to the Chern-Simons action through $A_\mu = \partial_\mu \phi$, whereas in the unitary gauge it appears in $A^U_z = \partial_z \phi$. In the latter case, the anomalous interactions of the pions are described by where $B$ is an infrared subtraction term required by the HW1 model and introduced in [52], but which disappears in the HW2 and HW3 models due to their different boundary conditions at $z = z_0$. With (29), this yields the following pion TFFs (for which $\mathrm{tr}(t^3 Q^2) = 1/6$), with where the last term vanishes automatically in the HW3 model. This can also be written in terms of $\phi$ and $\pi$ modes, which implies (equally in the HW1 and HW3 models) since $\mathcal{J}(0, z) \equiv 1$. In the chiral limit one has a constant $\pi_1(z) \equiv -1/(g_5 f_{\pi_1})$ as sole contribution, yielding $F_{\pi_1\gamma\gamma} \equiv F_{\pi_1\gamma^*\gamma^*}(0,0) = N_c/(12\pi^2 f_{\pi_1})$. We note parenthetically that the above subtraction term agrees with the one introduced in [52], but the bulk term therein involves $\partial_z(\phi - \pi)$, which is correct only for the $n = 1$ mode in the chiral limit where $\pi_1(z)$ becomes a constant. Away from the chiral limit, the expression proposed in [52] would in fact give $K_n(0,0) = 0$ for all $n$, because the boundary conditions on the normalizable modes of $\phi$ and $\pi$ are such that they vanish at $z = 0$. With nonzero quark masses, $K_n(0,0)$ can only be given numerically. However, the sum rule (30) implies that $y_S(0,z) = g_5 \sum_{n=1}^{\infty} f_{\pi_n} y_n(z)$ and consequently or showing that then all massive pions contribute to the anomaly. In Fig. 2 this is illustrated with the HW1m model (specified in Sec. V). However, also in the chiral limit, where $f_{\pi_n} \to 0$ for $n > 1$ so that the massive pions decouple from the axial vector current and the axial anomaly, $K_n(0,0)$ and $F_{\pi_n\gamma\gamma}$ remain nonzero.
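As a quick numerical illustration of the chiral-limit normalization quoted above, the following sketch evaluates $F_{\pi_1\gamma\gamma} = N_c/(12\pi^2 f_{\pi_1})$. The inputs $N_c = 3$ and $f_\pi = 92.4$ MeV are assumptions taken from the fit values quoted in Sec. V; the function and variable names are purely illustrative.

```python
import math


def pion_two_photon_coupling(n_c: int, f_pi_gev: float) -> float:
    """Chiral-limit F_{pi gamma gamma} = N_c / (12 pi^2 f_pi), in GeV^-1."""
    return n_c / (12.0 * math.pi**2 * f_pi_gev)


# Assumed inputs: N_c = 3 colors and f_pi = 92.4 MeV = 0.0924 GeV.
F_PI_GAMMA_GAMMA = pion_two_photon_coupling(3, 0.0924)
print(f"F_pi_gamma_gamma = {F_PI_GAMMA_GAMMA:.3f} GeV^-1")
```

With these inputs the coupling comes out at roughly 0.27 GeV$^{-1}$, which sets the scale against which the excited-pion couplings discussed in Sec. V can be compared.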
The asymptotic behavior of (43) reads [34] A n (Q 2 which also agrees with the pQCD behavior derived recently in Ref. [63]. Compared to the most general expression possible for the axial vector amplitude $\mathcal{M}(a \to \gamma^*\gamma^*)$ [67,68,69], the holographic result (42) has one asymmetric structure function (denoted $C(Q_1^2, Q_2^2)$ in [68]) set to zero; see Ref. [69] for a compilation of the available phenomenological information.
B. Longitudinal short distance constraint on HLBL amplitude
In the Bardeen-Tung-Tarrach basis of the HLBL four-point function [70], the longitudinal short-distance constraint of Melnikov and Vainshtein [8] in the region $Q_1^2 \sim Q_2^2 \gg Q_3^2 \gg m_\rho^2$ and $Q_4 = 0$, which is governed by the chiral anomaly and protected by its nonrenormalization theorem, reads [24,25] lim The short distance behavior of the form factors of both pseudoscalars and axial vector mesons implies that each individual meson gives a pole contribution with Π 1 3 . However, in Refs. [34,35] it was shown that in holographic QCD a summation over the infinite tower of axial vector mesons changes this. The infinite sum yields where $G_A$ is the Green function for the axial vector mode equation satisfying at $q^2 = 0$. For large $Q, Q_3 \gg m_\rho$, (46) is dominated by $z, z' \ll z_0$, where one can approximate $\mathcal{J}(Q, z) \to Qz K_1(Qz)$, and when $z = \xi/Q$ and $z' = \xi'/Q_3$. This asymptotic behavior of $G_A$ holds true in all HW models, including those with a finite quark mass term, because at small $z$ one has $\beta(z) \sim z^{2(\Delta_- - 1)}$ with $\Delta_- = 1$ for the standard choice $M_X^2 = -3$, leading to $n = 2$ in (48); with generalized $M_X^2$, one has $n > 0$ as long as $\Delta_- > 0$, which corresponds to $M_X^2 < 0$. In all cases, we thus obtain for $g_5^2 = 12\pi^2/N_c = (2\pi)^2$, as required by (45). This already implies that in the massive case the infinite tower of pions (and the other pseudo-Goldstone bosons) should not contribute to the Melnikov-Vainshtein constraint, i.e., summing the infinite number of contributions should not change the asymptotic behavior of the individual contributions in the same way as it happens with axial vector mesons. In order to check this, we need to analyse Since $\lim_{Q\to\infty} \mathcal{J}(Q, z_0) = 0$ and $\lim_{Q\to\infty} Q^2 \mathcal{J}(Q, z_0) = 0$, the first three terms in the curly brackets do not survive in the large-$Q$ limit.
The last term can be formally written as where L is the differential operator introduced in (22) and G its Green function as defined in (32).
In the limit of interest for the Melnikov-Vainshtein constraint, one has $z \ll z'$, thus (32) reduces to $(L + q^2) G(z, z'; q) = 0$; hence, effectively $L G(z, z'; 0) = 0$ and we obtain At parametrically small $z = \xi/Q$ and $z' = \xi'/Q_3$ and at $q^2 = 0$, (32) reduces to provided $\Delta_- > 0$. In the massive HW models with $M_X^2 = -3$, this gives $G(0; z, z') \to -M_q^2 \ln \max(z, z') + \mathrm{const.}$, (54) leading to Thus the summation of the infinite tower of pions does give a different asymptotic behavior than that of individual contributions, but only in the form of a logarithmic enhancement.
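The short-distance form $\mathcal{J}(Q, z) \to Qz K_1(Qz)$ used in the argument above depends only on $\xi = Qz$. The following sketch, which is an illustration and not part of the original analysis, evaluates $\xi K_1(\xi)$ from the standard integral representation $K_1(x) = \int_0^\infty e^{-x\cosh t}\cosh t\, dt$ with a simple trapezoidal rule. The function names, the cutoff `t_max`, and the step count are arbitrary choices made here.

```python
import math


def bessel_k1(x: float, t_max: float = 20.0, steps: int = 20000) -> float:
    """Modified Bessel function K_1(x) for x > 0, from
    K_1(x) = int_0^inf exp(-x cosh t) cosh t dt,
    using the composite trapezoidal rule on [0, t_max]."""
    h = t_max / steps

    def integrand(t: float) -> float:
        c = math.cosh(t)
        return math.exp(-x * c) * c

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


def uv_profile(xi: float) -> float:
    """Short-distance bulk-to-boundary profile xi * K_1(xi), with xi = Q z."""
    return xi * bessel_k1(xi)


for xi in (0.01, 0.1, 1.0, 10.0):
    print(f"xi = {xi:5.2f}   xi*K1(xi) = {uv_profile(xi):.6f}")
```

The profile approaches 1 for $\xi \to 0$, in line with $\mathcal{J}(0, z) \equiv 1$, and is exponentially small for $\xi \gg 1$, which is why terms evaluated at fixed $z_0$ drop out in the large-$Q$ limit.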
In Fig. 3, the contributions of the first four pion modes to , the left hand side of (55) with another factor of $Q_3^2$, are plotted for the HW1m model with $Q = 200$ GeV and increasing $Q_3$. Numerically, this is dominated by the contribution from the lightest pion, which completely swamps the logarithmic term (55), whose prefactor $M_q^2/(3\pi^2)$ is of the order $3 \cdot 10^{-6}$ GeV$^2$. Note that this term is suppressed by an extra power of quark mass compared to the contribution from the magnetic susceptibility [55,71] to the asymptotic behavior of $\bar{\Pi}_1$ worked out in Ref. [23]. In Fig. 4 the same is plotted with quark masses increased by a factor of 25 (corresponding to a pion mass of about 750 MeV), where the slow build-up of a logarithmic term becomes apparent.
When $M_X^2 < -3$, the logarithmic enhancement disappears and such that Only for $\Delta_- = 0$, which is at the border of the allowed range $M_X^2 \in (-4, 0)$, would the infinite tower of massive pions start to contribute to the Melnikov-Vainshtein constraint, exactly when the result (49) for the infinite tower of axial vector mesons would break down. However, in the following applications the phenomenologically interesting generalizations of $M_X^2$ all have $M_X^2 < -3$, where the contribution from the pseudoscalar tower to the longitudinal short distance constraint is suppressed by two inverse powers of $Q_3$ without even a logarithmic enhancement.
In the chiral limit, the massive pions still contribute to $\bar{\Pi}_1$, even though they decouple from the axial vector current and from the axial anomaly. With strictly $M_q = 0$, so in this case there is no enhancement from the summation of the infinite tower, irrespective of the value of $M_X^2$.
In the symmetric limit $Q_1^2 = Q_2^2 = Q_3^2 \gg m_\rho^2$, operator product expansion and LO pQCD imply that the longitudinal short distance constraint is 2/3 of the value appearing in (45) [8,23], In this case, the same derivation as above leads to which reproduces the correct power behavior, but for $g_5 = 2\pi$, as demanded by (14), the numerical value is only 81% of the result (59). Again, this result for the contribution of the infinite tower of axial vector mesons is the same for chiral and for massive HW models. The asymptotic contribution of the pseudoscalar tower is still given by the expression (51), but the argument given thereafter does not apply in the symmetric limit. However, numerically we found no evidence of an enhancement of the asymptotic behavior due to the summation of the infinite tower of massive pions beyond what is seen in the asymmetric case, see Figs. 3 and 4.

V. NUMERICAL RESULTS
In the following we compare the different HW models numerically, in particular with regard to the contribution of the infinite tower of massive pions to the hadronic light-by-light scattering amplitude and thereby to the anomalous magnetic moment of the muon.
In all models we have chosen $g_5 = 2\pi$, ensuring an exact fit of the asymptotic LO pQCD result (14). In the later discussion we shall also consider relaxing this constraint: at any large but finite energy scale there is a non-negligible reduction of TFFs of the order $\alpha_s/\pi$, which in AdS/QCD could perhaps be modelled simply by a small reduction of $g_5$. The low-energy parameters $f_\pi$ and $m_\rho$ are always fitted to their physical values, fixed to 92.4 MeV and 775 MeV, respectively, to be consistent with our previous work [34].
We consider two possibilities (HW1, HW3) for boundary conditions at $z = z_0$ [always given by (11)], each with either the standard choice $M_X^2 = -3$ or with $M_X^2$ as a tunable parameter. We thus employ four different HW models with nonvanishing light quark masses.
HW1m: the direct extension of the chiral HW1 model employed in our previous studies [34,41]. It coincides with model A of Erlich, Katz, Son and Stephanov [38] except that we have fitted to the mass of the neutral instead of the charged pions.
HW1m': deviates from the standard choice $M_X^2 = -3$ in order to attempt a better fit of the masses of the first excited pion and/or the lightest axial vector meson. It turns out that only the latter can be matched to the $a_1$ mass. The mass of the first excited pion is then also reduced compared to the pristine HW1 model, from around 1900 to 1600 MeV, but the target of 1300 MeV cannot be reached.
HW3m: uses the standard choice $M_X^2 = -3$, but boundary conditions as proposed in [51], which have the advantage of making a manual subtraction of infrared boundary contributions in the Chern-Simons action unnecessary.
HW3m': additionally uses $M_X^2$ as a free parameter, which in this model achieves a fit of the mass of the first excited pion of 1300 MeV; this was the main motivation put forward by Domènech, Panico, and Wulzer in Ref. [51] for proposing this kind of model. Our HW3m' deviates slightly from the parameters used in Ref. [51] because we fitted $f_\pi$, $m_\rho$, $m_{\pi^0}$, and $m_{\pi(1300)}$ instead of performing a least-squares fit over a larger set of low-energy parameters.
For the chiral limit, we only consider the HW1 model, where we update the results obtained for pions in [41] and recapitulate the results for axial vector mesons in [34].

A. Masses
As Table I shows, in the chiral HW1 model the mass of the lightest axial-vector multiplet is above the physical masses [72] $M_{a_1(1260)} = 1230(40)$ MeV and $M_{f_1(1285)} = 1281.9(5)$ MeV, but below that of the $f_1(1420)$, $M_{f_1(1420)} = 1426.3(9)$ MeV. While this prediction of the HW1 model is in the right ballpark, the mass of the first excited pion is 1899 MeV and thus significantly above $M_{\pi(1300)} = 1300 \pm 100$ MeV.
With nonzero light quark masses, the HW1m model has a slightly reduced axial-vector meson mass and increased excited pion masses. In the HW1m' model, where the $a_1$ mass can be matched, the lowest excited pion mass is reduced to about 1600 MeV.
With HW3 boundary conditions, there is additional chiral symmetry breaking and therefore a larger difference between the vector and axial vector meson masses, so that $m_{a_1}$ is now even above the mass of the physical $f_1(1420)$, but the excited pion mass is lowered substantially compared to the HW1m model, while still being too high. In the HW3m' model, where the first excited pion can be brought down to 1300 MeV, the axial vector meson mass is then also lowered, but remains somewhat larger than in the HW1 models. Even in the HW3m' model, where the pseudoscalar masses are the smallest, the second excited ($n = 3$) pion has a mass higher than the next established pion state $\pi(1800)$ (see footnote 5) and instead close to the next (less established) state $\pi(2070)$; in the other models $m_{n=3}$ is far higher.

B. Decay constants
The results for the decay constant of the lowest excited pion in the massive HW models span the range (1.56 ... 1.92) MeV, with the HW3m' model, where the mass of $\pi(1300)$ can be fitted, yielding the largest value. These values are well below the existing experimental upper bound of 8.4 MeV [73]; the result for the HW3m' model is actually fully consistent with the value 2.20(46) MeV of Ref. [74] that was adopted in Ref. [25] (see footnote 6).
The holographic results for the decay constant of the lightest axial vector meson, defined in analogy to (13), read $F_{a_1} = (493\ \mathrm{MeV})^2$ in the chiral HW1 model and $(426 \ldots 506\ \mathrm{MeV})^2$ for the different massive HW models. Note that in the literature frequently the mass of the axial vector meson is factored out [77]. In Ref. [78] a value of $F_{a_1}/m_{a_1} = 168(7)$ MeV has been obtained from light-cone sum rules. With $F_{a_1}/m_{a_1} = 177$ MeV for the chiral HW1 model and (148 ... 195) MeV for the different massive HW models (see Table I), the ballpark spanned by the holographic results is broadly consistent with that.
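The two ways of quoting the axial vector decay constant can be cross-checked against each other. The sketch below, which uses only the rounded chiral HW1 numbers given above, divides $F_{a_1} = (493\ \mathrm{MeV})^2$ by $F_{a_1}/m_{a_1} = 177$ MeV to obtain the implied $m_{a_1}$, and checks that the light-cone sum rule value 168(7) MeV lies inside the holographic range. The variable names are illustrative, and the implied mass is only accurate up to the rounding of the quoted inputs.

```python
# Chiral HW1 values as quoted in the text (rounded).
SQRT_F_A1_MEV = 493.0        # F_a1 = (493 MeV)^2
F_OVER_M_MEV = 177.0         # F_a1 / m_a1

# Implied mass of the lightest axial vector meson, in MeV.
IMPLIED_M_A1_MEV = SQRT_F_A1_MEV**2 / F_OVER_M_MEV
print(f"implied m_a1 = {IMPLIED_M_A1_MEV:.0f} MeV")

# Light-cone sum rule value 168(7) MeV versus the massive-HW range (148 ... 195) MeV.
SUM_RULE_INTERVAL = (168.0 - 7.0, 168.0 + 7.0)
HW_RANGE = (148.0, 195.0)
SUM_RULE_INSIDE_HW_RANGE = (
    HW_RANGE[0] <= SUM_RULE_INTERVAL[0] and SUM_RULE_INTERVAL[1] <= HW_RANGE[1]
)
print(f"sum-rule interval inside HW range: {SUM_RULE_INSIDE_HW_RANGE}")
```

The implied mass of about 1.37 GeV sits between the $f_1(1285)$ and $f_1(1420)$ masses quoted in the preceding subsection, consistent with the statement made there about the chiral HW1 model.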

C. Comparison of transition form factors
In Figs. 5 and 6 the single and double-virtual pion TFF following from the chiral and the massive HW models are compared with each other and with the results obtained in [21] in the dispersive approach [70]. The HW results all lie within the error band of the dispersive result, mostly above its central value, with only HW1m' below.
Table I also lists the amplitude $F_{\pi_n\gamma\gamma}$ for pseudoscalar decays into two real photons. For the ground state pion there is only a tiny change when finite quark masses are introduced, across all massive HW models. For excited pions there is more variation; for the first excited pion the range of $|F_{\pi_2\gamma\gamma}|$ is (0.196 ... 0.250) GeV$^{-1}$. In the HW3m' model, where one can fit to $m_2 = 1300$ MeV, the value is 0.206 GeV$^{-1}$. At present, no direct measurements are available, but data on certain branching ratios permit deriving an estimate for an upper bound [25], reading $|F_{\pi(1300)\gamma\gamma}| < 0.0544(71)$ GeV$^{-1}$. Evidently, the holographic results strongly overestimate the two-photon coupling of the first excited pion. This could be taken as a hint that it is better to choose a model where the masses of the lightest axial vector mesons can be fitted to experiment, namely the HW1m' model, since the two-photon couplings of axial vector mesons play a more important role altogether. The holographic results obtained for the whole tower of excited pions could still be taken as a rough estimate of (perhaps an upper bound of) this contribution in real QCD. It is certainly also conceivable that the contributions of individual modes are overestimated while their combined contribution is closer to real QCD, where the mass spectrum of excited pions is denser than in the HW models.
Table I also shows that the chiral HW1 model has almost identical results for $F_{\pi_n\gamma\gamma}$ as the HW1m model. As emphasized in Ref. [25], the vanishing of the decay constants of massive pions in the chiral limit does not preclude a coupling to photons. The former only means decoupling of the massive pions from the axial vector current and the axial anomaly. Thus massive pions contribute to the HLBL scattering amplitude also in the chiral HW1 model and therefore to the anomalous magnetic moment of the muon.
Fig. 7 displays the single and double-virtual TFFs for the first excited pion in the various HW models. Also shown are the corresponding quantities in the Regge model of Refs. [24,25] (RM) and its modification according to App. E therein (RM'). The Regge model TFFs satisfy the experimental upper bound on $|F_{\pi(1300)\gamma\gamma}|$, but do not obey the single-virtual Brodsky-Lepage limit with the decay constants of excited pseudoscalars. The HW results have a much larger amplitude at vanishing virtualities, but decay faster. They have a zero and a sign change before approaching the asymptotic behavior (41). The TFFs of higher excited pions have more zeros and reduced asymptotic limits due to smaller decay constants.
In Fig. 8 the shape of the axial vector TFF $A(Q_1^2, Q_2^2)/A(0, 0)$ is shown in the single-virtual and in the symmetric double-virtual cases. In the single-virtual case we also show the experimental result obtained in [42] for the $f_1(1285)$ meson, which is found to be remarkably compatible, in particular for the HW1m' model, where the mass of the lightest axial vector meson can be fitted.
Footnote 5: Note, however, that this state is sometimes considered to be a non-$q\bar{q}$ state [72].
Footnote 6: The decay constants of the higher pion modes fall off with increasing mode number $n$, inversely proportional to $m_n^x$, where $x$ is smaller than 1, which is very different from the value $x = 2$ of [75] and closer to but still not agreeing with [76] where $x = 1$. However, the HW models lack linear Regge trajectories; soft-wall models may be more realistic here.
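To quantify the overestimate of the two-photon coupling of the first excited pion discussed in this subsection, the following sketch divides the quoted holographic values of $|F_{\pi_2\gamma\gamma}|$ by the central value of the upper bound from [25]. Only numbers stated in the text are used; the dictionary keys are illustrative labels, and the uncertainty (71) on the bound is ignored here for simplicity.

```python
# Central value of the upper bound on |F_{pi(1300) gamma gamma}| from [25], in GeV^-1.
UPPER_BOUND = 0.0544

# Holographic values quoted in the text, in GeV^-1.
HW_VALUES = {
    "range minimum": 0.196,
    "HW3m'": 0.206,
    "range maximum": 0.250,
}

# Ratio of each holographic value to the central value of the bound.
RATIOS = {label: value / UPPER_BOUND for label, value in HW_VALUES.items()}

for label, ratio in RATIOS.items():
    print(f"{label:>14s}: {ratio:.2f} x upper bound")
```

All quoted holographic values exceed the central value of the bound by a factor of roughly 3.6 to 4.6.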

D. HLBL contribution to a µ
In Table I we also give the holographic results for the contributions to $a_\mu$, the anomalous magnetic moment of the muon, from the first few states of the pion and axial vector meson towers; in Fig. 9 the results for the $\pi^0$ and $a_1$ sector are shown in the form of a bar chart.
In the chiral HW1 model, the chiral TFFs for the pion have been combined with a pion propagator where the physical mass of $\pi^0$ has been inserted by hand. The result of $a_\mu^{\pi^0} = 65.2 \times 10^{-11}$ is remarkably close to that obtained in the massive HW models, which together span the range $a_\mu^{\pi^0} = (64.3 \ldots 66.6) \times 10^{-11}$. The results of the massive HW1m model are very close to those of the chiral HW1 model, also for the other contributions. Somewhat more variation is obtained in the other models, where either different boundary conditions or different values of $M_X^2$ are employed. Table II shows the sums of the contributions in the different sectors, where we have made a numerical estimate of the limit value when the infinite tower of axial vector mesons is included as in [34]; in the case of the excited pions, the contributions of the higher modes fall off very quickly so that we have just summed the first few modes.
For the contributions of the excited pions we obtain the range $a_\mu^{\pi^*} = (0.8 \ldots 1.8) \times 10^{-11}$ with the chiral model being at the lower end. Even though we have seen above that the HW models appear to severely overestimate the two-photon coupling $F_{\pi_2\gamma\gamma}$ when comparing with the upper limit [25] $|F_{\pi(1300)\gamma\gamma}|$ given above, the holographic results are somewhat below the contributions from the first few excited states obtained in [25] with large-$N_c$ Regge models. In Ref. [41] also the contributions from the $\eta$ and $\eta'$ pseudoscalars were estimated on the basis of the chiral HW1 model; we defer a precise evaluation of those in the massive HW models to future work where we plan to study the flavor-asymmetric case together with the contributions from the Witten-Veneziano mechanism for implementing the $U(1)_A$ anomaly. In the present flavor-symmetric setup, we would simply estimate the contribution of a whole $U(N_f = 3)$ multiplet ($P^*$) as $a_\mu^{P^*} \equiv 4 a_\mu^{\pi^*} = (3.4 \ldots 7.2) \times 10^{-11}$. The much higher contributions from the infinite tower of axial vector mesons, which in the chiral HW1 model reads [34]
[Table II, caption fragment: ... model where $a_{\mu(L)}$ denotes the longitudinal contribution only. Here $\pi$ and $a_1$ refer to the entire tower of pions and $a_1$ mesons; $\pi^*$ only to the heavy pions and $a_1^*$ only to excited axial vector mesons. $A$ and $P^*$ refer to a whole $U(N_f = 3)$ multiplet of axial vector mesons and excited pseudoscalars, where the contributions from the former are split into longitudinal (L) and transverse (T) parts. In the present flavor-symmetric case $a_\mu^{A} \equiv 4 a_\mu^{a_1}$ and $a_\mu^{P^*} \equiv 4 a_\mu^{\pi^*}$.]
[Fig. 9, caption fragment: ... in the various HW models, with excited modes given by increasingly darker colors, blue for the $\pi^0$'s, red for the $a_1$'s.]
Generally we find that the contributions from excited axial vector mesons are more important than those from excited pions, corresponding to the fact that only the infinite tower of the former plays a role in satisfying the longitudinal SDCs (which are completely satisfied in the asymmetric Melnikov-Vainshtein case, and at the level of 81% in the symmetric case). Massive pions already contribute in the HW1 and HW3 models in the chiral limit; away from the chiral limit their importance is not increased, despite the different asymptotic behavior of their summed contribution in the HW1m and HW3m cases where one has a logarithmic enhancement of the TFFs. Correspondingly, the contributions of the axial vector tower are not reduced; in fact, they tend to be higher with nonzero quark masses.

E. Discussion
Since we have seen above that the holographic results for the low-energy observables $F_{\pi_2\gamma\gamma}$ and $A_1(0, 0)$, the latter determining the equivalent two-photon rate of the lightest axial vector meson, are larger than indicated by experiments, the corresponding holographic results may perhaps be viewed as upper limits. In the following we investigate whether this situation improves when one tries to accommodate corrections to the high-energy behavior. In fact, the holographic HW models we have considered here have no running coupling constant; the TFFs reach their asymptotic UV limits somewhat too quickly.
In order to derive more plausible extrapolations to real QCD, we have considered a reduction of the value of $g_5^2$ by 10% and by 15%. This brings the asymptotic behavior of the TFFs down by amounts that are roughly consistent with perturbative corrections to the leading-order pQCD results at moderately high $Q^2$ values [80,81]. At the same time, the right-hand side of (14) is increased by a similar amount, which is consistent with the next-to-leading order terms in this expression [82].
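The effect of such a rescaling on the UV limit of the pion TFF can be illustrated with a minimal sketch. It assumes the HW1 matching $g_5^2 = 12\pi^2/N_c$ and an asymptotic coefficient $\lim Q^2 F(Q^2,0) = (g_5^2 N_c/12\pi^2)\, 2 f_\pi$ that is linear in $g_5^2$; the function names and the numerical value of $f_\pi$ are illustrative choices, not taken from the analysis itself.

```python
import math

def g5_squared_lo(n_c=3):
    """LO pQCD matching of the vector correlator (assumed form): g5^2 = 12 pi^2 / N_c."""
    return 12.0 * math.pi**2 / n_c

def tff_uv_coefficient(g5_sq, f_pi, n_c=3):
    """Assumed UV limit of Q^2 F(Q^2, 0): (g5^2 N_c / 12 pi^2) * 2 f_pi."""
    return g5_sq * n_c / (12.0 * math.pi**2) * 2.0 * f_pi

F_PI = 0.0924  # GeV; illustrative value of the pion decay constant

# HW1: unmodified coupling; HW1-: g5^2 reduced by 10%; HW1--: reduced by 15%
for label, reduction in [("HW1", 0.0), ("HW1-", 0.10), ("HW1--", 0.15)]:
    g5_sq = (1.0 - reduction) * g5_squared_lo()
    coeff = tff_uv_coefficient(g5_sq, F_PI)
    fraction = coeff / (2.0 * F_PI)  # relative to the LO pQCD value 2 f_pi
    print(f"{label:6s} g5^2 = {g5_sq:6.2f}  fraction of LO UV limit = {fraction:.2f}")
```

Under these assumptions the UV limit simply scales with the reduction factor, so the 10% and 15% cases sit at 90% and 85% of the LO pQCD value.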
With a 10% reduction, the HW model results for the pion TFF also get closer to the central result of the dispersive approach at all energies, while with 15% they are generally somewhat lower. In Table III we have listed the reduction factors resulting for $A_1(0,0)$ and various $a_\mu$ contributions in the chiral HW1 model, which we assume to be a good approximation in general. Applying the stronger reduction factors to the minimum values of the results of the massive HW models, we obtain the range […] in remarkable agreement with recent evaluations using the data-driven dispersive approach [21], where $a_\mu^{\pi^0} = 62.6^{+3.0}_{-2.5}\times 10^{-11}$, which has also been backed up by lattice QCD [22]. Doing the same for excited pseudoscalars and axial vector mesons, we obtain the following ranges as our predictions […] The latter result could be compared to the White Paper [4] values attributed to the axial sector and contributions related to the SDC, $a_\mu^{\text{WP,axials}} = 6(6)\times 10^{-11}$ and $a_\mu^{\text{WP,SDC}} = 15(10)\times 10^{-11}$. With linearly added errors these give $21(16)\times 10^{-11}$, which is significantly smaller.
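The linear combination of the two White Paper entries can be checked with a short sketch. The helper `combine` is a hypothetical name introduced here, and the quadrature variant is shown only for comparison; the text itself uses linear addition. The holographic value of $40\times10^{-11}$ is the one quoted in this section for maximal reduction of $g_5^2$.

```python
import math

def combine(entries, linear=True):
    """Sum central values; add errors linearly or in quadrature."""
    central = sum(value for value, _ in entries)
    if linear:
        error = sum(err for _, err in entries)
    else:
        error = math.sqrt(sum(err**2 for _, err in entries))
    return central, error

# White Paper values in units of 10^-11, as quoted in the text
WP_AXIALS = (6.0, 6.0)
WP_SDC = (15.0, 10.0)

wp_linear = combine([WP_AXIALS, WP_SDC], linear=True)
wp_quadrature = combine([WP_AXIALS, WP_SDC], linear=False)

# Holographic estimate quoted in the text for maximal reduction of g5^2
HOLOGRAPHIC = 40.0
excess = HOLOGRAPHIC - wp_linear[0]

print("linear:", wp_linear)
print("quadrature:", wp_quadrature)
print("excess over White Paper central value:", excess)
```

Linear addition is the more conservative choice, and the excess of the holographic value over the combined central value is of the order of the $20\times10^{-11}$ discussed in this section.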
In Ref. [46], a model-independent estimate of the effects of the longitudinal short-distance constraints on the HLBL contribution to $a_\mu$ has been proposed with the result $\Delta a_\mu^{(3)} = 2.6(1.5)\times 10^{-11}$ for the isovector sector. In Table II we have also listed the results for the longitudinal part of the contributions from the excited pions and the $a_1$ tower. Extending the range of the holographic results by the above reduction of the lower end, we obtain $a_{\mu(L)}^{\pi^*\cup a_1} = (6.0 \ldots 8.1)\times 10^{-11}$, which is significantly higher. However, excluding the ground-state axial vector meson, whose mass is close to the matching scale used in [46] and which (like any single excitation) does not contribute to the asymptotic value of the TFFs, we just have $a_{\mu(L)}^{\pi^*\cup a_1^*} = (1.7 \ldots 3.7)\times 10^{-11}$, in perfect agreement with $\Delta a_\mu^{(3)}$ of Ref. [46]. Including singlet and octet contributions, Ref. [46] has estimated $\Delta a_\mu^{(0)+(3)+(8)} = 9.1(5.0)\times 10^{-11}$, which is 3.5 times the $\Delta a_\mu^{(3)}$ result. In our flavor-symmetric models, we would have to multiply our results by a factor of 4, giving $(6.9 \ldots 15.0)\times 10^{-11}$, which still agrees well. It would be interesting to revisit the additional study in Ref. [46] where the HW2 results for the lightest axial vector meson were included and a result was found that exceeded the contributions of the remaining tower.
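These comparisons can be made explicit with a small interval check. The sketch treats each quoted uncertainty as a simple symmetric interval, which is an assumption made for illustration only; the helper names are hypothetical. Multiplying the rounded endpoints $(1.7 \ldots 3.7)$ by 4 gives $(6.8 \ldots 14.8)$; the slightly larger range quoted in the text presumably reflects unrounded inputs.

```python
def interval(central, error):
    """Symmetric interval central +/- error (illustrative reading of x(dx))."""
    return (central - error, central + error)

def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def contains(outer, inner):
    """True if inner lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def scale(rng, factor):
    """Multiply both endpoints of a range by a common factor."""
    return (rng[0] * factor, rng[1] * factor)

# All numbers in units of 10^-11, as quoted in the text
REF46_ISOVECTOR = interval(2.6, 1.5)
REF46_TOTAL = interval(9.1, 5.0)
HQCD_WITH_A1 = (6.0, 8.1)      # excited pions plus full a_1 tower
HQCD_EXCITED = (1.7, 3.7)      # ground-state a_1 excluded

ratio_total_to_isovector = round(9.1 / 2.6, 1)
hqcd_multiplet = scale(HQCD_EXCITED, 4)  # flavor-symmetric U(3) factor

print("with a_1 overlaps isovector estimate:", overlaps(REF46_ISOVECTOR, HQCD_WITH_A1))
print("excited only within isovector estimate:", contains(REF46_ISOVECTOR, HQCD_EXCITED))
print("ratio total/isovector:", ratio_total_to_isovector)
print("x4 range overlaps total estimate:", overlaps(REF46_TOTAL, hqcd_multiplet))
```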
Another method to estimate the effects of the LSDC has been used in [24,25], where a Regge model of excited pseudoscalars has been tuned to reproduce the LSDC. Although we have found in the holographic models that also away from the chiral limit it is the axial vector mesons alone that are responsible for the LSDC, the recently updated estimate obtained in [47], $\Delta a_\mu^{\text{LSDC}} = 13(5)\times 10^{-11}$, is fully consistent with our conclusions, as illustrated in Fig. 10.
The most important contributions missing in previous evaluations of the HLBL piece of $a_\mu$, if the holographic HW models are to be trusted, are those from the ground-state axial vector mesons. With the reduction of $g_5$, also the low-energy end of the axial vector TFF gets modified appreciably, expanding our range of predictions to $|A(0,0)|_{n=1} = (17.3 \ldots 21.3)\ \text{GeV}^{-2}$, which now has overlap with the experimental values quoted above in Sec. V C. The lowest value (which also fits the experimental data best) is in fact obtained in the HW1m' model, where the mass of $a_1(1260)$ can be fitted, yielding $a_\mu^{A_1} = 25.9\times 10^{-11}$. In this model, however, the $n=2$ axial vector TFF has a larger value than in the other models, and its contribution to $a_\mu$ is also fairly large, so that the second lightest axial vector multiplet alone contributes another $a_\mu^{A_2} = 7.5\times 10^{-11}$, despite having a mass much higher than that of established excited axial vector mesons. As we have already remarked, the total contribution of the axial vector tower in this model is the largest of all HW models (see Table II). With maximal reduction of $g_5^2$ it still yields $a_\mu^{A} = 40\times 10^{-11}$, coinciding with the central value of the range given in (62).
All in all, the holographic HW models that we have considered here point to $20\times 10^{-11}$ of extra contributions in the HLBL part of $a_\mu$ compared to the axial-vector and SDC pieces in the White Paper value [4] […] contributions. Such sizable upwards corrections are in fact compatible with the recent complete lattice calculation [28] with comparable errors, which obtained $a_\mu^{\text{HLBL,lattice}} = 106.8(14.7)\times 10^{-11}$.

VI. CONCLUSION
In this paper we have studied various AdS/QCD models with a hard wall and with a bifundamental scalar, which permits the introduction of finite quark masses and thus the extension of our previous work on hadronic light-by-light contributions in holographic QCD away from the chiral limit. In the chiral limit it was shown in Refs. [34,35] that summation of the contributions of the infinite tower of axial vector mesons changes the asymptotic behavior of the HLBL scattering amplitude of individual contributions precisely such that the Melnikov-Vainshtein LSDC is satisfied. By contrast, in [24,25] a Regge model of excited pseudoscalars has been constructed to achieve the same. Turning on finite quark masses, we found that in the holographic models the infinite tower of excited pseudoscalars, which is already present in the chiral limit and in fact has nonvanishing two-photon couplings despite vanishing decay constants, couples to the axial anomaly and then can lead to a certain enhancement of their contribution to the asymptotic HLBL amplitude, but never enough to contribute to the leading terms of the LSDCs. With the standard choice of the holographic mass of the bifundamental scalar, which determines the scaling dimension of quark masses and chiral condensates, this enhancement is merely logarithmic; generalizations are possible where power-law enhancements also arise, but these remain below what is relevant for LSDCs at leading order.
We have also considered the numerical consequences of introducing finite quark masses on the results obtained previously in the chiral limit, for simplicity only in the flavor-symmetric limit so that the $\pi^0$ and the $a_1$ sector can be covered, leaving the $N_f = 2+1$ case and also the consideration of the Witten-Veneziano mechanism for the $U(1)_A$ anomaly to future work. In doing so we have explored the two sets of boundary conditions that are possible in hard-wall models, and we have also considered the generalization of modified scaling dimensions of quark masses and chiral condensates proposed in [51], which permits fitting either the mass of the first excited pion or the mass of the lowest axial vector meson.
As displayed in Fig. 9, the massive HW models lead only to small (positive) changes in the contributions to $a_\mu$ from the $\pi^0$ and $a_1$ towers compared to the chiral HW1 model, when in the latter the physical pion mass is inserted manually in the pion propagator. Compared to experimental data, the two-photon coupling of the lowest axial vector meson, which is responsible for the second-largest contribution besides the ground-state pion, is somewhat too large, but it becomes consistent with experimental constraints when the five-dimensional coupling is adjusted such that the LO pQCD values of the TFFs are reduced by amounts corresponding to typical $\alpha_s$ corrections at moderately large energies. However, also after such adjustments, the contributions from the axial vector and excited pseudoscalar mesons ($40\times 10^{-11}$) are significantly larger than in the model calculations that have been used to assess their role and also the effect of SDCs in the White Paper [4], where $a_\mu^{\text{axials+SDC}} = 21(16)\times 10^{-11}$.
When $M_q \ll \Sigma z_0^2$, the weight function $z/\beta(z)$ is concentrated at small $z$, where its would-be divergence is cut off by $M_q$. In this limit, one can approximate the normalization condition (27) by replacing $y_n^2$ by its boundary value $g_5^2 f_{\pi_n}^2$ and the upper limit of the integral by infinity, yielding […] where […] For sufficiently small $M_q$ we thus obtain […]

For massive pions, i.e., for $n>1$, where $m_n$ approaches a nonzero value in the chiral limit, this implies that $f_{\pi_n} \to 0$, so they decouple, while the lightest pion with $f_{\pi_1} = f_\pi$ gives rise to the Gell-Mann-Oakes-Renner relation $f_\pi^2 m_\pi^2 = 2M_q\Sigma$ for $\alpha = 1$, while for $\alpha \neq 1$ one should perhaps rescale $M_q$ and $\Sigma$ before interpreting them as quark mass and condensate. (The scaling factor mentioned in footnote 2 drops out here.)

While $y_1$ always satisfies the boundary condition $(z/\beta)\,\partial_z y_1 = 0$ at $z =$ […] with $z/\beta \sim z^{-1+2\alpha}$ as $M_q \to 0$, it does not satisfy such a boundary condition with $\beta(z)|_{M_q=0}$. From the point of view of the strictly chiral HW model, $y_1$ corresponds to a solution with the different boundary conditions that pertain to those of the profile functions $y_S$ (up to an overall factor). Nevertheless, it can still be normalized by (27), since with the help of the equations of motion the divergent integral times the vanishing mass can be recast as […]

The holographic wave function of the massless pion can be given in closed form as the appropriate linear combination of the two Bessel functions […] In the special case of the chiral HW1 model with standard $M_X^2 = -3$ and thus $\alpha = 1$ the result reads [52] […]

The following table shows the changes brought about by a reduction of $g_5^2$ by 10% (HW1-) and by 15% (HW1--) in the chiral HW1 model.