Calculation of lepton magnetic moments in quantum electrodynamics: a justification of the flexible divergence elimination method

The flexible method of reduction to finite integrals, briefly described in earlier publications of the author, is described in detail. The method is suitable for the calculation of all quantum electrodynamical contributions to the magnetic moments of leptons. It includes mass-dependent contributions. The method removes all divergences (UV, IR and mixed) point-by-point in Feynman parametric space without any usage of limit-like regularizations. It yields a finite integral for each individual Feynman graph. The subtraction procedure is based on the use of linear operators applied to the Feynman amplitudes of UV-divergent subgraphs; a placement of all terms in the same Feynman parametric space is implied. The final result is simply the sum of the individual graph contributions; no residual renormalization is required. The method also allows us to split the total contribution into the contributions of small gauge-invariant classes. The procedure offers a great freedom in the choice of the linear operators. This freedom can be used for improving the computation speed and for a reliability check. The mechanism of divergence elimination is explained, as well as the equivalence of the method and the on-shell renormalization. For illustrative purposes, all 4-loop contributions to the anomalous magnetic moments of the electron and muon are given for each small gauge-invariant class, as well as their comparison with previously known results. This also includes the contributions that depend on the ratios of the tau-lepton mass to the electron and muon mass.


I. INTRODUCTION A. Quantum field theory, divergences and renormalization
It is well known that the first formulations of quantum field theory suffered from ultraviolet (UV) divergences.In 1947-1949 the first results appeared showing that the UV divergences in the physical observables are cancelled if we define the physical parameters properly.This demonstration was the result of the efforts of R. Feynman, J. Schwinger, H. Bethe, S. Tomonaga and others.A procedure for dealing with UV divergences in quantum electrodynamics (QED) at any order of the perturbation series was developed by F. J. Dyson [1] and A. Salam [2].A further development and a mathematical understanding of these ideas has led to N. Bogoliubov's R-operation.This operation can be represented as a subtraction procedure that deals with the Feynman amplitudes of the UV-divergent subgraphs in each Feynman graph that contributes to the needed probability amplitude; it removes all UV divergences in each Feynman graph.The application of the R-operation to Feynman graphs is equivalent to an introduction of counterterms into the Lagrangian; it can be treated as a renormalization of the theory parameters.The corresponding theorem was proved in 1956 by N. Bogoliubov and O. Parasiuk [3].This proof contains some inaccuracies, but nevertheless, the Bogoliubov-Parasiuk theorem set a standard of quality and raised the hope that quantum field theory has a meaning and can be rigorously examined.The shortcomings of the proof were corrected by K. Hepp in 1964 [4].
Not only the Bogoliubov-Parasiuk-Hepp theorem is of interest, but also its proof, because a stronger statement was proved: The UV subtractions lead directly to finite integrals in Schwinger parametric space 1 .In QED we have Feynman graphs with the propagators i( / q + m) q 2 − m 2 + iε , −g µν q 2 + iε for the lepton and photon lines correspondingly, where / q = q ν γ ν stands for the Dirac gamma matrices γ ν .If we transfer to the Schwinger parameters by using the formula we express the Feynman amplitude as an integral z1,...,zL>0 I(z 1 , . . ., z L , p 1 , . . ., p n , ε)dz 1 . . .dz L , where p = (p 1 , . . ., p n ) is the vector of external momenta, z = (z 1 , . . ., z L ) is the vector of Schwinger parameters, z j corresponds to the j-th internal line of the graph.If we apply the R-operation, I(z 1 , . . ., z L , p 1 , . . ., p n , ε) is expressed as a linear combination of terms; each term is a product of some functions, each of which corresponds to a Feynman amplitude that uses a subset of the set of internal lines; if the same mapping is used between z j and the internal lines, all the counterterms are placed in the same Schwinger parametric space.A byproduct of the Bogoliubov-Parasiuk-Hepp theorem is that (2) is finite for any ε > 0.
Unfortunately, these subtractions in Schwinger's parametric space are useless for calculations.The reason is the inability to handle the limit ε → 0. Infrared (IR) divergences and their cancellation have been widely discussed in literature.However, a rigorous examination for the general case has not yet been done.At higher orders IR and UV divergences mix with each other2 .In the case of magnetic moments of leptons, all IR divergences are removed by the physical on-shell renormalization as well as the UV and mixed ones 3 .A general scaterring process requires consideration of a finite photon detector sensitivity in addition to renormalization.The on-shell renormalization can be performed in place in Feynman graphs by a simple modification of the R-operation.In the case of the lepton magnetic moments, it removes all divergences in the final result.However, the individual integrals remain IR divergent.
The combinatorics of the IR and mixed divergences in Feynman graphs is very complicated.For this reason, their subtraction before integration is very rarely used in modern quantum field theory computations.Nowadays, most calculations use dimensional regularization, which allows us to work directly with infinite components without exact mappings for cancellation.However, the use of dimensional regularization requires an enormous amount of symbolic manipulation at higher orders.An experience with calculations of the lepton magnetic moment shows that the methods that substract divergences before integration work much faster and allow us to obtain high-order corrections that are not achievable with other methods.

B. Anomalous magnetic moments of leptons and their calculations
The anomalous magnetic moments (AMM) of the electron and muon are known with a very high precision.A recent measurement [10] gave the result a e = 0.00115965218059 (13) ( Standard Model predictions for the electron AMM a e use the following expression: (m e /m µ ) + A (m e /m τ ) + A where m e , m µ , m τ are the masses of the electron, muon, and tau-lepton, respectively, α is the fine-structure constant; a similar expression is used for the muon AMM a µ .
The universal QED terms A (2n) 1 (α/π) n form the main contribution to a e and a µ .The value was obtained in 1948 by J. Schwinger [11,12].The 2-loop contribution A 1 was calculated mainly by R. Karplus and N. Kroll [13].However, this calculation had an error; the correct value was independently presented by A. Petermann [14] and C. Sommerfield [15] in 1957.The value of A (6) 1 was being calculated in 1970-x by various research groups using numerical integration ( [16,17]; [18]; [19]); each of these groups used its own method of divergence elimination at the integrand level; the most accurate value A was obtained by S. Laporta and E. Remiddi in 1996 [20].This value was the result of the efforts of many researchers (see, for example, [21][22][23][24][25][26] ).First numerical estimates for A (8) 1 were obtained by T. Kinoshita and W. B. Lindquist in 1981 [27].The most accurate value presented by T. Kinoshita's team, A 1 = −1.91298(84),was published in 2015 [28].This value was obtained by Monte Carlo integration.The semianalytic result of S. Laporta was obtained in 2017 [29].These two calculations of A (8) 1 agree well, as do another independent calculations [30,31] and our calculation of Feynman graphs without lepton loops [32].First overall calculation results for A (10) 1 were published in 2012 by T. Aoyama, M. Hayakawa, T. Kinoshita, M. Nio in [33] 4 .The last value obtained by this group in 2019 [34] is [AHKN] = 6.737(159). ( This coefficient has not yet been verified, and a significant computational error could be apparent in experiments.We recalculated the total contribution of the graphs without lepton loops to A (10) 1 and presented [35] in 2019 the value A [no lepton loops,Volkov] = 6.793(90) that leads to A [Volkov+AHKN] = 5.862(90) (5) with a discrepancy of 4.8σ with (4).At the moment the discrepancy is unresolved, but independent calculations are coming [36].The values (4) and ( 5) in combination with the experimental value (3) and another known contributions [34] lead to and correspondingly.The values obtained from the measured ratios of the atomic masses and the Planck constant (11) come from [37], [38], and [39], respectively.Note that α −1 [Rb-2020] is the largest among these three values and has a discrepancy of 5.4σ relative to α −1 [Cs-2018].The tensions with (6) are 1.97σ, 3.86σ, 2.09σ; the corresponding tensions with (7) are 1.89σ, 3.64σ, 2.46σ.Other results exist for small classes of 5-loop and higher order graphs [40][41][42]; they are in good agreement with the ones mentioned above.
An analytic formula for A (4) 2 (x) was first established by H. H. Elend [43] in 1966.Numerical values for A 2 appeared about 1969 [44].Most attention was paid to A (6) 2 (m µ /m e ), because its value A (6) 2 (m µ /m e ) ≈ 20 became unexpectedly large.These values were also being calculated analytically in the form of an expansion in the mass ratio.These expansions are being published since 1975 with an increasing number of terms [45][46][47][48][49].The values A (6) 3 were also evaluated numerically [50], semianalytically [46] and analytically [51] as expansions in the mass ratios.First numerical values of A (8) 2 (m µ /m e ) appeared in 1990 [50].The recent values of A 3 for the electron and muon obtained by T. Aoyama, M. Hayakawa, T. Kinoshita, M. Nio with the help of Monte Carlo integration are presented in [28,52].Semianalytic calculations using expansions in mass ratios confirmed these results [53,54].Other results for certain classes of graphs are also in good agreement with them [55][56][57][58][59][60][61].The only known values of A for the electron and muon were presented in 2012 by T. Aoyama, M. Hayakawa.T. Kinoshita, M. Nio [33,52].Partial calculations of small graph classes confirm these results [42,46,57,62].However, a very important contributing to a µ value (m µ /m e ) = 742.18(87) has not been double-checked yet.It is much larger (in absolute value) than the corresponding electron value.A significant error in this value could be noticeable in experiments, and the shift would be comparable to the hadronic uncertainty.In addition, rapid growth of A (m µ /m e ) with n may cause the higher-order terms to significantly affect the result.However, estimates based on known lower-order values and renormgroup-inspired arguments [50,52,[63][64][65], as well as non-relativistic calculations [66], show that both possibilities seem unlikely.

C. Methods of numerical calculation and gauge-invariant classes
Methods based on subtraction of divergences before integration have been widely used for calculations of QED contributions to magnetic moments of leptons [16,18,19,34] and for other quantum field theory calculations [68].
In 2016, we presented a method [69] suitable for computing A (2n) 1 for arbitrary n.The difference of this method from the previously known methods is that it is based on linear operators applied to the Feynman amplitudes of UVdivergent subgraphs, and the final result is obtained directly by summing the contributions of the Feynman graphs (no residual renormalization is required).
A distribution of the final result over Feynman graphs is a matter of free choice, but it is very untrivial to realize this freedom in the context of methods that subtract divergences before integration and yield the final result without residual renormalization, because the integrals obtained must be finite for each individual Feynman graph, but the whole procedure must conform to the physical on-shell renormalization.In a sense, this subtraction procedure must contain an evidence that the on-shell renormalization removes all divergences, including infrared and mixed divergences.In 2021, we presented a modification [70] of the developed method; it allows us to choose 3 linear subtraction operators independently 5 .Moreover, the new method is suitable for the calculation of mass-dependent . Some classes of Feynman graphs form gauge-invariant classes.For example, any class of graphs closed under a motion of the internal photon lines along the lepton paths and loops, but without jumping over the external photon line, is invariant in the class of Lorentz-invariant gauges with the photon propagators of the form −(g µν +k µ k ν f (k 2 ))/(k 2 +i0), if we properly define the on-shell renormalization for subclasses of graphs 7 8 , see Section IV A.
Examples of the use of the new method, as well as a comparison with the old method, were given in [70,71].In this work we show that the method yields the exact value for each gauge-invariant class mentioned above, including those contributing to A , as well as a finite integral for each Feynman graph.

II. THE METHOD FORMULATION
We work in the unit system where = c = 1, the factors of 4π appear in the fine-structure constant: α = e 2 /(4π), the tensor g µν corresponds to the signature (+, −, −, −), the Dirac matrices satisfy the condition γ µ γ ν + γ ν γ µ = 2g µν .We also suppose e = 1 for the lepton electric charge; since we work with each term of the perturbation series separately, it changes nothing, but makes the expressions shorter.
We extract the lepton AMM from QED Feynman graphs with N l = 2, N γ = 1, where by N l and N γ we denote the number of external lepton and photon lines in the graph; each graph may contain electron, muon, tau-lepton lines.We also assume that all graphs are one-particle irreducible and have no odd lepton loops (Furry's theorem).
We work in the Feynman gauge with the propagators (1) for leptons and photons, respectively, where m is the lepton mass.
Two subgraphs are said to overlap if they are not contained in each other and the intersection of their line sets is not empty.
A set of subgraphs of a graph is called a forest if any two elements of this set do not overlap.
For a vertexlike graph G, we denote by F[G] the set of all forests F consisting of UV-divergent subgraphs of G and satisfying the condition G ∈ F .The lepton path of a graph G connecting its external lepton lines is called the main path of G.By I[G] we denote 11 the set of all vertexlike subgraphs of G (including G) which have the vertex incident to the external photon of G and at least one vertex of the main path of G.
To define the subtraction procedure, we use the linear operators labeled A, U 0 , U 1 , U 2 , U 3 , L; sometimes the same names are used for operators applied to the Feynman amplitudes of subgraphs of different types.The definitions are: • A is applied to vertexlike Feynman amplitudes 12 Γ µ (p, q) and is defined as where A ′ is the projector of the AMM.See the definition of the projector in [35,69].
• L is the standard on-shell renormalization operator for vertexlike graphs; it is defined as • U 0 is applied to photon self-energy and photon-photon scattering subgraphs and works as in standard renormalization.For the Feynman amplitude Π µν (p 2 ) of the photon self-energy, we can take, for example, where a scalar product is implied, and Optimizations as described in [72] can also be used for photon self-energy graphs; the choice of the photon self-energy renormalization method does not play an important role in our subtraction procedure.For example, 9 We consider only those subgraphs which are one-particle irreducible and contain all lines connecting the vertexes of the given subgraph; since odd lepton loops are forbidden, a UV-divergent subgraph is one-particle irreducible if and only if it is amputated.Photon-photon scattering subgraph divergences cancel in the final result without subtraction, but they remain in the individual graphs.
The definition differs from that in [69].p − q 2 , p + q 2 are incoming and outgoing lepton momenta, q is the photon momentum.
a more efficient method was used for the calculations; see Section V.However, the definition given here is convenient for proofs.
For the Feynman amplitudes of photon-photon scattering, we can take where zero 4-momenta are denoted by 0. It also can be defined with vector-valued operators U ′ 0 as it was done for photon self-energy Feynman amplitudes.
• U j , j = 1, 2 are applied to vertexlike and lepton self-energy Feynman amplitudes; U 3 is applied only to vertexlike amplitudes.They are defined as where Γ µ (p, q) and Σ(p) are vertexlike and lepton self-energy Feynman amplitudes, U ′ j are number-valued linear operators, We require that: is also satisfied.The latter can be rewritten as ).We emphasize that the same symbol U j is used for lepton self-energy and vertexlike Feynman amplitudes; the definitions for the different types are independent from each other, the requirements connect them.
These conditions are sufficient to prove combinatorially that the subtraction procedure defined below is equivalent to on-shell renormalization, provided that all regularization issues are resolved.
To obtain finite integrals for each graph, more stringent requirements are needed.We take where (8), ( 13) are satisfied, M 2 is an arbitrary real number (it may be different for different operators).As U ′ 3 Γ we can take a(M 2 ) or L ′ Γ. Being an operator means that it is applied to a function and depends only on this function (it does not depend on the internal structure of the corresponding subgraph).However, the same symbol U j is used for linear operators that are applied to Feynman amplitudes of different types (a type is the set of the external line types); the linear operators under the same symbol U j for different types can be chosen independently.Of course, U j can be chosen independently for different j.U 1 and U 3 are defined only for the Feynman amplitudes with the external leptons coinciding with those whose AMM is calculated; U 2 are defined for all types of external leptons (for example, it can be chosen independently for the Feynman amplitudes with external electrons, muons and tau-leptons).
The expression for the subtraction and extraction of the AMM corresponding to a Feynman graph G is where and G ′′ has its external leptons on the main path of G, the index of an operator denotes the subgraph to whose Feynman amplitude it is applied; , where V (G) denotes the set of vertices of G.We emphasize that the whole graph G is used in the definition of ξ; thus, ξ does not depend on G ′ and G ′′ .At the level of Feynman amplitudes, each term of the expression sequentially transforms the Feynman amplitudes of subgraphs using corresponding linear operators (the subgraphs should be ordered by inclusion in this sequence, from smaller to larger).Finally, the coefficient before γ µ should be taken.For example, for the graph G from Fig. 1 the expression is where G e = aa  Another example is the graph G from Fig. 2. In this case, I[G] = {G}, the expression is The operator U 1 is applied to the subgraph cdef ghi, because its external leptons are on the main path of G. To obtain finite Feynman parametric integrals, all terms should be placed in the same Feynman parametric space.The algorithm for obtaining integrals from expressions is briefly explained in [71] and in detail in [69].The needed value of A (2n) l is simply the sum of the integrals corresponding to the Feynman graphs involved.

III. DIVERGENCE ELIMINATION
The principle of divergence elimination in the method is similar to that explained in [69] for the old method.Here we describe schematically how this choice of linear operators allows us to separate divergences and how it leads to divergence cancellation.
Let us consider the 1-loop unrenormalized vertexlike Feynman amplitude13 and the fact that p µ commutes with all other multipliers that may appear in the expression, we obtain where A and B some IR-divergent values 14 .Thus, U 1 Γ 1 and U 2 Γ 1 are IR-finite.On the other hand, the UV-divergent part of Γ 1,µ (p, q) does not depend on p, q and is therefore proportional to γ µ .Thus, (L − U j )Γ 1 , j = 1, 2 are UV-finite.
There is a general observation: • (L − U j )Γ, j = 1, 2, 3 has no overall UV divergence for any vertexlike Feynman amplitude Γ (of any order); • U j Γ, j = 1, 2 does not have IR divergences at all, provided that the mass renormalization has been done properly (see below); • this is also true for U j Σ, j = 1, 2, where Σ is a lepton self-energy Feynman amplitude.
In particular, this means that U 1 , U 2 can be used to subtract UV divergences without generating additional IR divergences (unlike L).

FIG. 3. 2-loop examples
Consider the 2-loop ladder graph G in Fig. 3,left 15 .The corresponding expression is In this case, we have only logarithmic divergences.The overall UV divergence corresponds to In all terms it is cancelled by A since it is proportional to the Born amplitude (Aγ) µ = 0, where by γ we denote the 4-vector of Dirac gamma matrices; this can be shown using standard methods for studying IR divergences in QED.
There is also a mixed UV-IR divergence in this graph: k 1 → 0, k 2 → ∞; in this case, the UV divergence corresponding to abc is IR-infinitely enhanced by the remaining propagators.In the first 2 terms it is cancelled by A, because the UV-divergent part of abc is proportional to γ µ , so the "enhanced" value is proportional to (Aγ) µ = 0.In the remaining term, it is cancelled in A abc for the same reason.The remaining divergences 16 The difficulty in this case is the power-type IR-divergence corresponding to k 1 → 0 and a fixed k 2 = 0.However, it is eliminated by the mass subtraction part in U 1 .To make this possible, we give no freedom to the definition ( 14) of M ′ in (12).After the mass subtraction, the IR divergence becomes logarithmic and is removed by A, as is the overall UV divergence.The UV subdivergence of bc is removed by 1 − U 1 , and no additional IR divergence is generated.

FIG. 4. An example with U3 in the expression
Let us consider the graph G on Fig. 4 .The expression is The last multiplier eliminates the UV divergence corresponding to f ghi; in this case there are no problems with IR divergences.The last term of the first multiplier removes the IR divergence corresponding to k ae → 0 (the subscript denotes the graph line by its ends).The term A G (U 3 ) bcdf ghi removes the UV divergence corresponding to bcdf ghi.It is important to note that it does not create any additional IR divergence, although U 3 has more freedom in definition than U 1 and U 2 (we can take U 3 = L, for example).The reason is that the on-shell Feynman amplitude of bcdf ghi has no IR divergence, because its external photon is on a lepton loop.The cancellation of divergences in Feynman parametric space for an arbitrary graph G can be shown in a similar way as in [69], but U should be replaced everywhere by the corresponding U j .Let us note one more small change.Following [69], Appendix C, we assume that G where (15) is implied, we also assume that G 0 is the graph consisting of only one vertex of G joining the external photon (if ξ = 3, we use G 0 as the empty graph).By z l we denote the Feynman parameter corresponding to the internal line l.With the help of power counting, we investigate the case where β l ≥ 0 are some numbers corresponding to the internal lines.By P j we denote 17 the set of all lines l on the main path of G such that l ∈ G j+1 and l / ∈ G j .The cancellation of UV divergences and UV parts of mixed divergences is demonstrated in the same way 18 as in [69].If all divergences not belonging to I[G] are properly subtracted, a potentially IR divergent case corresponds to a vector of divergences [v 0 , . . ., v n ; w 0 , . . ., w n ], where v j ≥ w j ≥ 0, 2v 0 − w 0 = 2v 1 − w 1 = . . .= 2v n − w n > 0, there exists j such that w j = 0.This vector gives the values We will refer to the index j, 1 ≤ j ≤ n as a transition index for the divergence vector [v, w] if v j > v j−1 .Correspondingly, the index j is called an inverse transition index if v j < v j−1 .The difference with the investigation in [69] is the following: 1.An index j is called separating in a term of the expression if there is an operator A Gj , L Gj or (U 3 ) Gj (U 3 = L) in this term.The difference with [69] is the possibility to take U 3 (if it is equal to L).
2. If ξ = 3, then no consideration of the case v 0 > w 0 is necessary; this case cannot produce divergence in a term, because G 1 has its external photon on a lepton loop.
3. If ξ = 3 and U 3 = L, we rewrite the expression differently for a given index H of the inverse transition in [v, w]: The last term was added to (C.7) in [69], inappropriate (1 − U ) multipliers were removed.The factor L G l L GH − L G l cancels the divergence in this term, analogous to the other terms.
This argument shows how it works, but a full rigorous examination seems difficult and cumbersome; we rely on a numerical check (see Section V).

A. Definition
The definition of on-shell renormalization is well known for the coefficients of the expansion in α, but it requires a definition for subsets of Feynman graphs.We define it as the sum of the individual graph contributions obtained by the in-place on-shell renormalization.It can be written in terms of expressions such as those used above.The expression for a graph G is where where ( 13) and ( 14) are satisfied.
The multipliers S G ′ for photon-photon scattering graphs G ′ can be omitted in gauge-invariant classes as well as transformations like described in [72] can be applied for photon self-energy graphs.[69] for the old version of the method.In the new version each operator U should be replaced by U 1 or U 0 .

2-loop examples are given in
The Ward identities for individual Feynman graphs play an important role in proving the equivalence.We will illustrate this by the gauge-invariant class shown in Fig. 5.The investigation applies to both the new and old methods (the difference is only in the name of the U -operator).

FIG. 5. An example of a 3-loop gauge-invariant class
Table I provides the parts of the contributions, obtained with our method, that contain U -operators.The lowerorder Feynman amplitudes used in these expressions are shown in Fig. 6.The boxed dots represent the mass vertices (the vertices that give 1 in the Feynman rules).By Γ 1 and Σ 1 we also denote the 1-loop vertexlike and lepton selfenergy amplitudes (without special vertices), respectively.In this case, I[G] = ∅, and we have no IR subtractions; therefore, the remaining contributions are the same as those obtained by the in-place on-shell renormalization.
FIG. 6. Feynman amplitudes participating in the expressions for the set in Fig. 5 and the corresponding Feynman graphs After summation, we obtain We use the following Ward identities for individual graphs: Each identity contains one lepton self-energy graph and all vertexlike graphs obtained by inserting an external photon into a line on the main path.Each coefficient equals 1, but some of our coefficients equal 2, because we are working with undirected Feynman graphs.The identities can be proved by standard methods used in QED 19 .The corollary is From this it follows that all terms with U ′ 1 are cancelled.Similarly, we can prove that all terms with L ′ and B ′ are lifted in the on-shell expression, which implies the equivalence 20 .
Let us examine examples where the new and old methods work differently.We start with the 4-loop class IV(b) from Fig. 7,left.We suppose that the graphs contribute to the muon g − 2 and contain an electron loop. 19See [73]. 20provided that all regularization issues are resolved; dimensional regularization can be used in most cases.The difference is that the old method yields additional subtractions in graphs 2,10 because each of them has a vertexlike subgraph joining the external photon and not lying on the main path 21 .Also, the operators U 2 are used everywhere instead of U .The contribution of IV(b) is equal to e +(terms with A only) +(remaining terms with mass parts) The first term is cancelled because its second multiplier equals 0 due to the Ward identity; the light-by-light term is similar and is cancelled for the same reason.The expression for the on-shell renormalization has the same form, but the occurrences of U ′ 2 are replaced with L ′ , B ′ ; its terms already considered are cancelled due to the same reason.The remaining terms are identical.
Consider the class IV(c) from Fig. 7,right.In this case, we also assume that the graphs contribute to the muon g − 2 and have an electron loop.Graphs 7 and 8 give terms with U 3 .The total contribution of these graphs is We ignore the terms with U ′ 0 and M ′ in this investigation, because they do not cause any problems.To show that the first term equals 0, we need the Ward identities of the form Γ µ (p, 0) = 0 for single graphs.It holds for sets of vertexlike graphs that have the external photon on a lepton loop that are closed under the movement of it along the loop.The validity can be proved in the usual way.Since U 3 preserves this kind of Ward identities, its contributions are cancelled.The remaining graphs yield we get that all terms containing U ′ 1 are cancelled after summation.The total contribution obtained by the in-place on-shell renormalization is The difference between our and on-shell contributions equals It equals 0, because Let us summarize.To prove the equivalence, we need two kinds of the "Ward identities" for single graphs: • Γ µ (p, 0) = −∂Σ(p)/∂p µ ; the set includes one lepton self-energy graph and all vertexlike graphs obtained from it by inserting an external photon into a line on the main path.
• Γ µ (p, 0) = 0; the set consists of vertexlike graphs having the external photon on a lepton loop; it is closed under the movement of the external photon along the loop.
Graphs in the sets may contain special vertices.See also examples of the use of the Ward identities for individual graphs in Section IV.H of [32].

C. Complete proof
Suppose we have a class K of 1-particle irreducible vertexlike Feynman graphs that is closed under the movement of internal photons along lepton loops and paths, but without jumping over the external photon.We prove that its contribution obtained with our method is equal to the contribution obtained with the in-place on-shell renormalization.Since the ideas used here are similar to those widely used in QED, the explanation is just a framework.
Let us give some definitions.Suppose we have one term X of the expression with linear operators constructed for some graph G of this class.The subgraphs to which the operators in X are applied form a rooted tree; in this tree, G is a root, and applied to some graphs with special vertices.This expression can have several terms Y 1 , . . ., Y n , because U j Σ is split into U ′ j -part and M ′ -part.Each term Y j can be represented as a layer tree.A layer is a Feynman graph obtained from one graph to which an operator in X is applied by schrinking all its childs to special vertices (corresponding to operators in Y ).A layer tree is a rooted tree, each node of it is (l, O, r), where l is a layer; O is one of the operators r is the reference to the corresponding vertex of the parent (it does not exist for the root).The root of the layer tree corresponds to G in X. Vertices and inputs of one vertex that have the same type are enumerated 22 .The contribution of the layer tree (without coefficient) is the product of the node values 23 , where the value of the node (l, O, r) is OΓ l , where Γ l is the Feynman amplitude corresponding to the layer l.For simplicity, we will sometimes interchange the layer tree nodes with the layers. 22The enumeration is needed to avoid symmetry problems with 4-photon vertices.If we have an enumeration, the coefficients should be properly accounted for; this can be done with a standard technique; we do not do this in our consideration.An examination of the coefficients is only necessary for symmetry issues; no reduction of similar terms takes place in our consideration. 23Strictly speaking, tensor contractions are needed when we have U ′ 0 layers, because for simplicity we decided to use one label for all kinds of special vertices that can be generated.For example, in the definition ( 9), (10) for photon self-energy graphs, U ′ 0 is a vector.
For example, the term A G (U 0 ) def ghi (U 2 ) gh for the graph G from Fig. 8,left gives the layer tree depicted in Fig. 8,right.
FIG. 8.An example of a Feynman graph (left) and a layer tree (right) that is possible for it (with U ′ 0 and U ′ 2 for lepton self-energy graphs) The following markers are used for vertices on images: 1. a round dot is an ordinary γ ν vertex; 2. a boxed dot is a mass vertex (which gives 1); 3. a cross-dot is a ( / p − m)-vertex; 4. a cross in a circle is a vertex coming from photon self energy subgraphs; 5. a diamond dot is a vertex originating from photon-photon scattering subgraphs.

Another example is the term
for G from Fig. 9,left, where M ′ parts are taken for each B (the term is part of the direct on-shell renormalization expression).Its layer tree is depicted in Fig. 9,right.The root is layer 1.
The sequence of layer tree nodes Y 1 , . . ., Y n of vertexlike type is called the main branch if the following conditions are satisfied: • Y 1 is the root of the layer tree; • Y j+1 is the child of Y j corresponding to the vertex of Y j joining its external photon; • Y n has no vertexlike children corresponding to its vertex joining its external photon; the type of a layer is its multiset of the external line types. ( FIG. 9.An example of a Feynman graph (left) and a possible layer tree for it (right) with M ′ -operators We say that the node X of a layer tree is locally on a path, if the corresponding to X parent's vertex lies on parent's main path; we also assume that the root of the layer tree is locally on a path.If the node is not locally on a path, it may be on a loop or on a special 2-photon or 4-photon vertex of the parent.
The maximal initial segment Y 1 , Y 2 , . . ., Y k of the main branch Y 1 , Y 2 , . . ., Y n , all elements of that are locally on a path, is called the I-branch.
Let us proceed with the proof.We represent the contribution of K obtained by our method as the sum of the layer tree contributions (with coefficients).Each layer tree has one node X on the I-branch that has the operator A ′ .We call it the A-node.The operators corresponding to vertexlike or lepton self-energy nodes X satisfy the following rules: • if X is the A-node, then the corresponding operator is A ′ ; • if X is the root but not A-node, then the operator is • if X is an ascendant of the A-node but not the root, then the operator is L; • if X is a descendant of the A-node and belongs to the I-branch, then the operator is U ξ , where (15) is satisfied for graphs G in K; • if X is not on the I-branch and all ascendants of X and X itself are locally on a path, then the operator is • if X is not on the I-branch and at least one ascendant of X or X itself is not locally on a path, then the operator is U ′ 2 or M ′ .Let us perform the following operations with this layer tree sum: 1. Remove all terms containing layers of photon-photon scattering type.In fact, they are cancelled by the identities for individual photon-photon scattering graphs of the form Γ µ1µ2µ3µ4 (0, 0, 0, 0) = 0, where Γ is the corresponding Feynman amplitude, and the external momenta are in the parentheses.The identity is satisfied for a given set of graphs, if the set is closed under the movement of external photons along lepton loops, and every graph in this set contains at least one external photon line joining a lepton loop (we consider graphs with special vertices; thus an external photon can fall into a 4-photon vertex).The identities can be proved with a standard technique that is used in QED 24 .Let us consider the elimination in detail.We say that the layer trees T 1 and T 2 are equivalent if one is obtained from the other by the movement of external photons along lepton loops in some layers of photon-photon scattering types which have no descendants of photon-photon scattering type (the absence of the descendants guarantees that all moved photons are on lepton loops 25 ); we suppose that the structure of the descendants is preserved when a photon is shifted; the children referring to the vertex joining the moved photon change their reference to the new photon location on a lepton loop.It is easy to see that if T 1 is obtained from a Feynman graph of K, this is also true for T 2 .Therefore, all terms are split into equivalence classes.In the contribution of each of these classes, the identities mentioned above are factorized.This leaves only the classes that have no photon-photon scattering layers at all.Note that the movement of photons (both external and internal) along lepton loops does not change the 1-particle irreducibility.
2. Remove all terms containing vertexlike layers with a non-A ′ operator and the external photon on a loop 26 .The operator is All these operators yield 0 for the Feynman amplitudes Γ µ (p, q) satisfying Γ µ (p, 0) = 0. Thus all these terms are cancelled by the identities of this form mentioned in Section IV B. This can be proved in a similar way as for the case with photon-photon scattering layers.After this step, all occurrences of U ′ 3 are removed, since at least the U ′ 3 corresponding to the end of the I-branch is applied to a layer with the external photon on a lepton loop.
3. Represent the layer trees as trees of M-blocks.An M-block includes a node (vertexlike or lepton self-energy-like) with an operator A ′ , L ′ , U ′ j (j = 1, 2) of the layer tree, all its children with operators M ′ , U ′ 0 , their children with these operators and so on 27 .For example, there are two M-blocks in 9,right: {1, 2, 3, 5} and {4, 6}; the latter is a child of the former and refers to the vertex of layer 2 to which layer 4 refers.M-blocks can also be represented as layers.The layer is obtained from the corresponding subgraph by shrinking the subgraphs corresponding to its children (in the tree of M-blocks) to special vertices (in a sense, it is obtained by gluing the layer tree layers belonging to the M-block).An M-block can also be viewed as a tree of nodes.Let us define the main vertex of the M-block; if the layer has its external photon on its main path, the main vertex is layer's vertex joining its external photon; otherwise, it is the last vertex of the main path of the layer.4. Single out the S-branch of M-blocks in each layer tree.We define the S-branch of M-blocks Y ′ 1 , . . ., Y ′ n in a tree of M-blocks by the following conditions:

5.
For each M-block take a one-to-one correspondence between the layer vertices (except the main vertex) and its Σ-chains lying on the same lepton path part or loop.A Σ-chain is a peace of a lepton path or loop, which ends are ordinary γ µ vertices, but all intermediate vertices are of ( / p − m) type (the situation when there are no intermediate vertices is also possible).We require that the Σ-chain lies on the same side of the M-block external photon as the corresponding vertex (if it is applicable).We should take this correspondence deterministically (it should depend only on the topology of the Σ-chains and external lines).For example, for each v on a lepton loop, we can take the outgoing from v chain; for v on the main path we can take the Σ-chain joining v and having another end closer to the main vertex of the M-block.
6. Remove all terms containing M-blocks outside the S-branch.Let us explain how they are cancelled.An M-block is called factorizable if it has at least one child that does not refer to its main vertex.We say that the M-block trees T 1 and T 2 are equivalent if one is obtained from the other by a sequence of the following transformations: • suppose Y ′ is a factorizable M-block that has no factorizable children, v is an ordinary γ µ -vertex in Y ′ , the main vertex of Y ′ is not equal v, the Σ-chain σ corresponds to v, the child M-block referring to v exists and is denoted by Y ′′ ; the transformation is: remove the external photon in the root layer of Y ′′ , move the reference of Y ′′ to the new ( / p − m) vertex inserted at the beginning of σ; the operator applied is U ′ j , j = 1, 2 They cannot fall to 2-photon vertices due to 1-particle irreducibility.
We consider here only the internal structure of the layer, not its reference to its parent.We do not use here that M ′ properly extracts the mass part, but it is used for the proof of gauge invariance [8].
and remains the same after the transformation; the structure of the descendants and their references are preserved, the reference to the main vertex of Y ′′ is moved to the main vertex of the transformed M-block (there are no references to non-main vertices, since Y ′ has no factorizable children).
• if under the same conditions there are no children referring to v, and the intermediate vertexes of σ are w 1 , . . ., w r , r ≥ 1 (ordered along the line direction), Y ′′ is the child M-block referring to w 1 , then insert an external photon into an arbitrary line on the main path of the root layer of Y ′′ , remove w 1 and move the reference of Y ′′ to v. the applied operator is U ′ j , j = 1, 2 and remains the same after the transformation; the structure of the descendants and their references are preserved (as in the previous case).
These transformations are inverse to each other.They are illustrated in Fig. 10; here v is the round dot, σ starts from v, T 0 is a subtree referring to v in the layer tree, the subtrees T 1 , . . ., T n refer to the intermediate vertexes of σ; on the right side, T 0 is replaced by T ′ 0 (by removing the external photon in the root layer) and is connected to the begin of σ.This equivalence relation splits the terms into equivalence classes.If a term belongs to the expression, all equivalent terms also belong to the expression; it is important here that these transformations do not change the 1-particle irreducibility of the whole graph as well as of the subgraphs to which the operators are applied 28 ; the absence of odd lepton loops in the whole graph is also preserved.In the sum of contributions belonging to one equivalence class, the multipliers of the form U ′ j Γ + U ′ j Σ (as mentioned in Section IV B) are factorized.If the number of these multipliers is at least 1, the contribution equals 0.
The presence of M-blocks outside the S-branch means that there is at least one factorizable M-block in the S-branch.If it has factorizable children, take one of them; and repeat the operation until it has no factorizable children.So there is at least one factorizable M-block that has no factorizable children.The cancellation of these terms follows from this.
Note: On the one hand, the correspondence between v and σ may cross the boundaries of the layers constituting the M-block; on the other hand, the procedure of inserting and removing the external photon is always performed in the root layer of the M-block, and the structure of Σ-chains is ignored in this layer (the main path may contain special vertices); but when we move the reference, we use the main vertex of the whole M-block.

7.
Combine the terms with different placements of the A-nodes.After the previous step, the contribution can be written as a sum of terms like where Γ • Y ′ j are correct vertexlike M-blocks (taking into account their internal layer structure); • G ∈ K, where G is reconstructed by the procedure described above; , where x, . . ., x may have 0 elements; • for all i such that O ′ i = A ′ the layer Y ′ i has its external photon on its main path.Any permutation of Y ′ 1 , . . ., Y ′ n does not change the validity of G ∈ K for the reconstructed graph G29 .Thus, the sum can be represented as a sum of terms ( 16) with the same conditions, but with vectors ).We can prove by induction that Thus, we arrive at the sum of terms with the same conditions.
With the contribution of K obtained by the in-place on-shell renormalization 30 we perform the same operations, but without the last step and using the identities for L ′ and B ′ .These operations result in the same sum of terms (17) with the same conditions.

V. COMPUTATIONAL RESULTS
The new method was tested by numerical calculations.The contributions of 25 gauge-invariant classes contributing to A for the muon were presented in [70].Here we present all 4-loop results for the electron and the muon for all gauge-invariant classes (with electron, muon and tau-lepton loops).They are summarized in Table II.We used the values m µ /m e = 206.76828103,m τ /m µ = 16.81665.in our calculations.The uncertainty of the masses was not taken into account.The aim of the calculation was to verify the method, not to obtain precise physical results (the mass uncertainty does not play a significant role at this level of precision).Some of the previously known values for comparison that are given in the tables were calculated with other values of the mass ratios (and the uncertainties may be inaccurate).The Monte Carlo integration was performed on the ITP/TTP KIT computing cluster with the help of the GPU NVidia A100 and took about three GPU-weeks; a method similar to that described in [32,35] was used31 .Table III presents the computed contributions of the gauge-invariant classes from Fig. 11 to 1 and a comparison with S. Laporta's results [29].Table IV contains all the contributions of the classes from Fig. 12 for different particles x and y.Since the contribution does not depend on the line directions, all graphs on the images are undirected.Also, permutting the elements on a photon does not change the contribution, but we calculated these contributions separately, except A Throughout the calculations, the operators (U j Γ) µ (p, q) = a(−m 2 )γ µ , (U j Σ)(p) = s(−m 2 )( / p − m) + r(m 2 ) + s(m 2 )m are used for j = 1, 2, 3 and all leptons, where (8) and ( 13) are satisfied; therefore, U -subtractions are performed at space-like momenta (p 2 < 0); this is a general observation: the subtractions at space-like points make the oscillations slightly smaller and the convergence of the Monte Carlo integration slightly faster [71].
For photon self-energy subgraphs, the replacement where (11) is implied, was used instead of subtractions.It makes the integrand simpler and its evaluation faster.This approach was used earlier in many calculations; see, for example, [72].Known lower-order analytical expressions for the polarization operator were not used; the reason is the attempt to make everything simple for realization; in addition, the usage of known analytical formulas can improve the calculation speed only for those contributions that already require not too large numbers of the Monte Carlo samples to reach the needed accuracy; this would not affect the whole calculation speed significantly.Moreover, if the number of self-energy subgraphs is small, the "direct" realization does not lead to a significant increase in the complexity; this significantly affects only the number of the integration variables (which is not a problem for the Monte Carlo integration, provided that a good probability density function has been chosen).

(2n) 3 (
m e /m µ , m e /m τ ), 195 ± 0.026 for those times was obtained in 1974 by T. Kinoshita and P. Cvitanović; the uncertainty is caused by the statistical error of the Monte Carlo integration.At the same time, an analytical calculation of A energy or photon-photon scattering subgraph; here ξ = 3, if the vertex incident to the external photon of the whole graph G lies on a lepton loop of G, 1 otherwise; 1 a 2 b 1 b 2 c 1 c 2 c 3 c 4 d 1 d 2 d 3 e 1 e 2 e 3 e 4 e 5 , G d = {aa 1 a 2 b 1 b 2 c 1 c 2 c 3 c 4 d 1 d 2 d 3 }, G c = aa 1 a 2 b 1 b 2 c 1 c 2 c 3 c 4 , I[G]= {G e , G}, subdiagrams are denoted by enumeration of their internal vertices; we should expand the parentheses to obtain the forest formula.The expression for this example does not depend on the type of leptons on the lepton loops (U 2 may be defined differently for different types of leptons, however).

FIG. 1 .
FIG. 1.An example of a Feynman diagram contributing to AMM.

FIG. 2 .
FIG. 2. Another example of a Feynman diagram contributing to AMM.
are the UV divergence k 2 → ∞ and the IR divergence k 1 → 0. The first is cancelled in the sum of two terms by 1 − U 1 applied to the amplitude of abc and in the last term by A abc .The second does not exist in A G (U 1 ) abc , in A G it enhances the amplitude generated by A abc by an IR-infinite multiplier, and it is cancelled by (L − U 1 ) G A abc , because L − U 1 completely extracts the IR-divergent part of the enhancing multiplier.Let us examine the graph G on Fig.3,right.The expression is

2 2 +
(terms with light−by−light operators) e e e is a child of Y ′ j−1 and refers to the main vertex of Y ′ j−1 ; • there are no children of Y ′ n that refer to the main vertex of Y ′ n .An initial segment Y ′ 1 , . . ., Y ′ k of the S-branch corresponds to the I-branch Y 1 , . . ., Y k .Due to the 1-particle irreducibility of each subgraph to which an operator is applied, all elements of the S-branch are of vertexlike type.

FIG. 10 .
FIG.10.Transformations leading to the cancellation of all terms with M-blocks outside the S-branch.

3 [
II(c2,ty)].Tables V and VI contain a comparison with known values for A ], respectively.Table VII contains the contributions of the classes from Fig.13to A(8) 3 for electron and muon, Table VIII contains their comparison with known results.

TABLE I .
Contributions with U -operators of the graphs from Fig.5obtained with our method (16)..., Γ n are the vertexlike Feynman amplitudes (after the appropriate application of −M ′ and −U ′ ′ j is inserted into the vertex joining the external photon; otherwise, it is inserted into the end of the main path.The term(16)corresponding to Y ′ 1 , . . ., Y ′ n , O ′ 1 , . . ., O ′ n exists in the expression for K if the following conditions are satisfied:

TABLE II .
The calculated total 4-loop contributions (for the electron and the muon) and their comparison with previously known values; the values for comparison are from the first reference in the lists.

TABLE V .
Comparison with previously known values for A of the electron (x = e); the values for comparison are from the first reference in the lists.Class Value × 10 3 , y = µ Value for comparison Value × 10 5 , y = τ Value for comparison Ref.