Nonlinear trident in the high-energy limit: Nonlocality, Coulomb field and resummations

We study nonlinear trident in laser pulses in the high-energy limit, where the initial electron experiences, in its rest frame, an electromagnetic field strength above Schwinger's critical field. At lower energies the dominant contribution comes from the"two-step"part, but in the high-energy limit the dominant contribution comes instead from the one-step term. We obtain new approximations that explain the relation between the high-energy limit of trident and pair production by a Coulomb field, as well as the role of the Weizs\"acker-Williams approximation and why it does not agree with the high-$\chi$ limit of the locally-constant-field approximation. We also show that the next-to-leading order in the large-$a_0$ expansion is, in the high-energy limit, nonlocal and is numerically very important even for quite large $a_0$. We show that the small-$a_0$ perturbation series has a finite radius of convergence, but using Pad\'e-conformal methods we obtain resummations that go beyond the radius of convergence and have a large numerical overlap with the large-$a_0$ approximation. We use Borel-Pad\'e-conformal methods to resum the small-$\chi$ expansion and obtain a high precision up to very large $\chi$. We also use newer resummation methods based on hypergeometric/Meijer-G and confluent hypergeometric functions.


I. INTRODUCTION
Quantum electrodynamics in strong laser fields is usually studied by treating the interaction with the quantized photon field in a standard perturbation expansion in α = e 2 /(4π), but with a Volkov/Furry picture treatment of the strong field. The strength of the field is usually described in terms of the "classical nonlinearity parameter" a 0 = E/ω 1 , where E is the field strength and ω a typical frequency scale of the in general pulsed plane wave. For a 0 > 1 one can in general not treat the field in a perturbation expansion in a 0 . However, if a 0 1 one can make an expansion in 1/a 0 [1,2], which corresponds to an approximation where the inhomogeneous field is treated as being locally constant. This is a very useful approximation that allows otherwise very complicated processes to be studied. However, a 0 is not the only parameter in the system and so how large a 0 has to be for this locally-constant-field (LCF) approximation to be valid depends on the momenta of the particles involved [3][4][5][6].
A plane wave on its own cannot produce any particles, but a single particle entering a plane wave can, and at high intensity such a seed particle can lead to the production of a large number of particles in cascades [7][8][9]. So, consider a single particle with momentum p µ that enters the strong field. In a plane-wave field, the integrated/total probability that this particle decays/produces some other particles (e.g. an electron * g.torgrimsson@hzdr.de 1 We use units with electron mass me = 1, and a factor of the charge e has been absorbed into the field strength eE → E. and a positron in trident) only depends on a 0 and a second parameter b 0 = kp, where k µ is the null wave vector of the plane wave (k 2 = 0 and k 0 = ω). The coefficients in the LCF expansion in 1/a 0 1 only depend on b 0 via the "quantum nonlinearity parameter" χ = a 0 b 0 . (Note that χ is independent of ω, so the LCF expansion can be seen as a derivative expansion.) The limit where χ becomes very large, that is αχ 2/3 1, is different from what one might expect from the highenergy limit of QED without a strong field, and it has been conjectured that the expansion in α might even break down in this regime [10][11][12][13][14][15][16][17]. This would then be a regime where neither the strong field nor the α dependence can be treated with perturbation theory, i.e. QED would be truly strongly coupled. This conjecture is an old result which has attracted a great deal of interest in the last couple of years [18][19][20][21][22].
It has recently been shown [5,6] that whether the large χ limit is reached by having a 0 or b 0 being the largest parameter leads to fundamentally different results. If a 0 is the largest parameter then LCF is good, but if b 0 is largest then the probabilities of nonlinear Compton scattering and Breit-Wheeler pair production reduce to the leading perturbative results, i.e. they become proportional to a 2 0 , and then one has ordinary high-energy scalings without the suggestion of an α expansion break down.
In this paper we study and compare these two ways of reaching high χ for the trident process [1,[23][24][25][26][27][28][29][30][31][32][33][34], e − → e − + e − + e + . One motivation for this is that for αχ 2/3 1 one is interested in whether the α expansion breaks down, so it is natural to ask how the results in [5,6] generalize to higher-order processes, and the trident process is such an example. The trident probability is of course not the next-to-leading order correction to be added to the nonlinear Compton or Breit-Wheeler probabilities, but it does correspond to the imaginary part of a loop that is part of the α expansion of the e − → e − amplitude (the mass operator). Moreover, for constant fields this loop gives the dominant contribution at O(α 2 ) (the loops giving e.g. double nonlinear Compton scattering [13,[35][36][37][38][39][40][41] is subdominant), see the review [18] for a collection of the various loops that have so far been calculated.
Another motivation for this study is phenomenological. Part of the trident process was observed at SLAC two decades ago [25]. Since then there have basically not been any new experiments. But there are now definite plans for new trident experiments at e.g. LUXE [42,43] and FACET-II [44,45]. In the old SLAC experiment the laser was relatively weak, i.e. a 0 < 1, and, due to the lack of complete theoretical predictions, a Weizsäcker-Williams (WW) approximation was used to estimate the trident probability. However, the new experiments will have larger a 0 and it has been shown [28] that this WW approximation does not give the same result as the LCF approximation. One might still want to use an approximation such as WW, because LCF only works when a 0 is sufficiently large, and so it is not clear how good LCF is for 1 < a 0 < 10, which is a regime that is relevant for upcoming experiments. WW on the other hand is associated with high energy rather than high intensity, so one might expect that WW could be used in regimes where LCF is not good. In this paper we show that there is indeed a regime where the WW approximation is good.
We also compare the high-energy limit of trident with pair production by a Coulomb field in a plane wave. To do this we generalize our results in [1] to a process where the initial electron is replaced by another particle, e.g. a muon, which has the same charge but different mass.
Studying trident is also relevant as the first step in cascades, i.e. the production of a large number of particles. For trident one can compare approximations with the exact result as a way of determining in what parameter regimes similar approximations can be used for higherorder processes, for which one cannot compare with the exact result. In this context one may ask how the momentum of the initial particle is distributed among the produced particles. In the emission of a single photon by an electron, the probability has in the high-energy limit a peak where the emitted photon takes away almost all of electron's momentum [46,47]. In contrast, for trident, where the emitted photon decays into a pair, one finds that the probability is largest when the initial electron keeps almost all of its momentum and only gives a small fraction to the emitted photon and the produced pair. This low-momentum transfer has an important impact on the behavior of the high-energy limit of trident compared to the first-order processes [5,6].
With this study we are also mapping a part of the parameter space not considered in previous literature. At small a 0 the probability is perturbative and scales to leading order as O(a 2 0 ), which is a regime that has been studied since the 40's [48]. For large a 0 the leading order scales as O(a 2 0 ) [23,24], while the full next-to-leading order O(a 0 ) was only calculated recently [1,31]. In [1] we also considered the low-energy regime and obtained explicit analytical expressions valid for arbitrary a 0 1. In this paper we complement these previous studies by providing new analytical results in the high-energy limit, for arbitrary a 0 .
The rest of this paper is organized as follows. In Sec. II we provide some basic definitions and the generalization of some of our results in [1] to muon trident. In Sec. III A we consider the large-χ limit of the LCF approximation, i.e. we first take a 0 to be the largest parameter and then we take χ large. In Sec. III B we study the limit where b 0 is the largest parameter. We show that having a 0 largest and then b 0 large does not commute with having b 0 largest and then a 0 large. In Sec III C we compare our new high-energy approximation with pair production by a Coulomb field. In Sec. III D we study the applicability of the WW approximation. In Sec. III E and III F we study the next-to-leading order corrections in the largea 0 expansion of the high-energy approximation, and show that these corrections are nonlocal and numerically important. In Sec. III G we study the perturbation series around a 0 = 0. We show that there is a finite radius of convergence. Using Padé approximants and a conformal map we find that the coefficients in the perturbation series can be used to obtain a good approximation beyond the radius of convergence and even for large a 0 . In Sec. IV we compare with the low-energy regime in the case where the initial particle is much heavier than the produced pair. In Sec. V we resum the divergent smallχ expansion and obtain a resummation that has a high precision up to very large χ. We conclude in Sec. VI.

II. DEFINITIONS
In terms of the vector potential the field is given by a ⊥ (φ), a 0 = a 3 = 0, where φ = kx = ωx + . In order to understand the high-energy limit of the trident process, it is useful to consider the process where the initial electron is replaced by a particle with a different mass but with the same charge. This could for example be muon trident [23,24,49] µ − → µ − + e − + e + , but we will keep the mass µ of the initial particle arbitrary because we are also interested in comparing with the infinite mass limit. When the initial particle is an electron then we have two identical particles in the final state, so there are two terms on the amplitude level M = M 12 − M 21 , where one M 12 is obtained from the other M 21 by swapping these identical particles. We refer to the cross term 2ReM * 21 M 12 between these two terms as the exchange part of the probability P = P dir +P ex . The exchange term is the most difficult to calculate. Indeed, it was for a long time omitted in the literature, even for the simplest case of constant fields. We have shown though that e.g. for short pulses and moderately high intensity (a 0 ∼ 1) the exchange term is important. The complicated exchange term is of course absent if the initial particle is not an electron. If the initial particle is an electron then the exchange term becomes small compared to the direct part of the probability at high enough energies.
The generalization (of the direct terms) to an arbitrary mass µ is obtained in the same way as in [1], we only have to replace the electron mass (which is m e = 1 in our units) with the muon mass µ in some places. When comparing with [23,24] it is important to note that we still use units with m e = 1, so for an incoming muon we have µ ≈ 207. In [1] we had two identical particles in the final state and therefore had to divide the probability by a factor of 2 to avoid double-counting when summing over momenta and spin. Since we do not have identical particles here, we have an overall factor of 2 compared to [1]. In the identical-particle case one has two contributions to the probability, |M (s 1 , s 2 )| 2 + |M (s 2 , s 1 )| 2 , which give the same contribution to the total/integrated probability. So, for the total probability, in the (mathematical) limit µ → 1 our results here reduce to the direct terms in [1] with the same factors of 2 in the prefactor. As in [1,50] we integrate over the transverse momenta of all the final state particles. The longitudinal-momentum spectrum P(s) is defined via These longitudinal-momentum variables are the ratios s i = kp i /kp, s 3 = 1 − s 1 − s 2 , and we use b 0 = kp for the initial particle. We use q 1 = 1 − s 1 for the longitudinal momentum of the intermediate photon. In our approach we find it convenient to separate the total probability into three terms, which have different number of lightfront time integrals.
There is nothing fundamental about this particular separation. In fact, for a constant field, or for large a 0 , P 22 gives one term that is quadratic in the volume and another term that is linear in the volume, where the latter should then be combined with P 11 and P 12 , which are also linear in the volume. Since the calculation is basically the same as in [1], we simply state the final results, which are valid for arbitrary field shape and polarization. The simplest term comes from the absolute square of a "lightfront-timeinstantaneous" term on the amplitude level where M e and M µ are the effective mass [51] for the electron and muon, respectively, where We have inserted factors of s 0 = 1 in appropriate places to make symmetries clearer. As in other processes we have considered, we always find that the exponential part of the integrands can be expressed in terms of the effective mass. The cross term between the "lightfrontinstantaneous" and the "three-point-vertex" parts of the amplitude is give by where D 12 = ∆ 12 ·∆ 32 and The third and final term is given by where κ ij = (s i /s j ) + (s j /s i ), 43 . In (8) and in the following we have left the i prescription implicit. The singularities at θ ij = 0 are always avoided with an integration contour equivalent to replacing φ 2,4 → φ 2,4 + i /2 and φ 1,3 → φ 1,3 − i /2 with > 0. Note that P 12 (s) is anti-symmetric with respect to s 2 ↔ s 3 , so for the integrated probability we find P 12 = 0. Thus, for the integrated probability and for µ = 1, we just have two terms, P 11 and P 22 . P 11 is almost as simple as a first-order process, and P 22 can be obtained from the incoherent product of the two first-order processes nonlinear Compton scattering by a "muon" and nonlinear Breit-Wheeler electron-positron pair production. In some regimes, thought, it is natural to split P 22 = P 22→1 + P 22→2 into two terms, where P 22→1 and P 22→2 scale linearly and quadratically in the volume, respectively. This can be done by splitting the step func-tion combination in (8) as [1] θ(θ 42 )θ(θ 31 ) In the first term the average lightfront time in the pairproduction step σ 43 = (φ 4 + φ 3 )/2 can be much later than σ 21 = (φ 2 + φ 1 )/2 for the photon-emission step, e.g. the photon can be emitted at one field maximum and then propagate to some later field maximum before it decays. In the second term σ 43 and σ 21 are forced to be close. So, the first term gives P 22→2 and the second term P 22→1 . One can show that for a 0 1 we have , so this is a natural separation at least for large a 0 . We therefore define The two-step P two gives the dominant contribution for high-intensity a 0 1 or for a long pulse length. This two-step dominance at a 0 1 is the basic assumption in particle-in-cell codes. In this paper we are interested in the high-energy limit, where the dominant contribution instead comes from the one-step P one .

III. HIGH ENERGY LIMIT
In this section we will study the limit where b 0 is the largest parameter in the system. In [34] we showed numerically that the direct part of the one-step becomes dominant in this regime. In this section we will derive analytical approximations for this case. In the following two subsections we will for simplicity set µ = 1. We will reinstate µ in Sec. III C.
A. High-χ limit of LCF For comparison, we first consider the large-χ limit of the familiar LCF. LCF can be obtained by starting with our expressions that are valid for arbitrary field shapes, and then expanding them in a power series in 1/a 0 , which is small in the LCF regime, see [1]. We have P two = a 2 0 P 2 + O(a 0 0 ) and P one ≈ a 0 P 1 . P 2 and P 1 depend on b 0 only via χ = a 0 b 0 . For large χ we can neglect the exchange part of P 1 and we find and where γ E ≈ 0.577, χ(σ) = a 0 b 0 |f (σ)| and the potential is given by a(σ) = a 0 f (σ). These are simple generalizations of the constant-crossed-field case, which was derived in [24], to slowly varying, locally-constant fields. If we keep a 0 constant and increase b 0 then P one ∼ a 0 ln χ and P two ∼ a 2 0 (1/χ 2/3 ) ln χ or which means that eventually (the right-hand-side of) (11) becomes larger than (the right-hand-side of) (12), suggesting that P one becomes larger than P two at sufficiently high energies. However, the LCF approximation breaks down at very high energies: One can obtain the LCF expansion by rescaling θ 21 →θ 21 /a 0 and θ 43 →θ 43 /a 0 , and ϕ = σ 43 − σ 21 →φ/a 0 for P one , and then expanding the resulting integrands in 1/a 0 . In deriving the highχ limit (11) and (12) one finds that only a small fraction of the initial longitudinal momentum is given to the electron-positron pair, more precisely s 2,3 ∼ 1/χ. (Contrast this with the large χ limit of single-photon emission without subsequent pair production, where there instead is a peak where the emitted photon takes almost all the energy from the electron [46,47].) One also finds that θ 21 ∼ χ 2/3 . So, to be sure that the LCF expansion is still valid, we need which means that for a given a 0 one cannot take χ arbitrarily large. Note that this implies which is a more precise, process-and regime-specific condition compared to the general rule-of-thumb a 2 0 b 0 [3]. As an aside we note that at low b 0 we find in general b 0 1 : where f (a 0 ) depends on the field shape. For a 0 1 we have b 0 1 a 0 1 : where the factor of 16/3χ gives the LCF result and c is a numerical factor. So, for LCF to be good in this regime one needs 2 Illustration of the fact that in the high-energy limit all the field dependence comes from the pair-production step. Double lines represent fermions dressed by the background field and the single line is an electron without interaction with the field. The wiggly line is a photon that is not part of the background field.
The condition (14) means that one cannot trust (13) when this ratio becomes larger than one. This is after all what one can expect for the relation between the first (P two ) and second term (P one ) in a (Laurent) power series. However, although the LCF approximation breaks down, we have found that if one increases b 0 with a 0 kept constant, then P one does in fact become larger than P two (but P one is no longer given by (11)). That P one can dominate in certain parameter regimes is not surprising, because for a 0 1 we have P one ∼ O(a 2 0 ) but P two ∼ O(a 4 0 ), so P one dominates for sufficiently weak fields. Moreover, the large-b 0 limit of the probabilities of the O(α) processes nonlinear Compton and Breit-Wheeler [5,6], suggests that, at constant a 0 (but not necessarily small), increasing b 0 takes us closer to perturbative physics. In the following we will show that this is partly true for trident, but, due to the low momentum transfer, trident has a more nontrivial dependence on a 0 .

B. Large b0 limit
So, we now take b 0 to be the largest parameter, and we therefore leave the LCF regime. In this limit it turns out that the spectrum is peaked at 1 − s 1 ∼ 1/b 0 1. This means that the initial electron keeps most of its longitudinal momentum, and the intermediate photon (in the P 22→1 case) scales as kl In the exponent of the direct terms we have factors of r 1 /b 0 ∼ 1/b 2 0 , which suggests a rescaling of θ → b 2 0 θ similar to [6] for single nonlinear Compton scattering. However, for the second step we have r 2 /b 0 ∼ b 0 0 , which leads to a fundamental difference. For P 11 dir we should not rescale θ with b 0 . In the limit of large b 0 we can still perform the longitudinal momentum integrals. We first change variables from s 2 = q 1 (1 + ν)/2 to ν and from s 1 = 1/(1 + t) to t. Then we rescale t → t/b 0 , expand the integrand in b 0 and perform the resulting t and ν integrals. We find with an integration contour equivalent to θ → θ +i . The second leading-order term comes from P 22→1 dir . To calculate this term we start by making a partial integration in θ 21 to change 1/θ 2 21 into 1/θ 21 . In the non-boundary terms we rescale θ 21 → b 2 0 θ 21 and σ 21 − σ 43 → b 2 0 σ 21 . To leading order this means that there is no field dependence for the θ 21 and σ 21 integrals, i.e. we can put e.g. D 1 → 0. The σ 21 integral is trivial and gives a factor of |θ 21 |, which we represent as Since Θ 21 → b 2 0 θ 21 to leading order, the resulting θ 21 integral can now be performed with the residue theorem. Then we perform the r integral and finally the longitudinal momentum integrals (t and ν). The boundary term coming from the partial integration in θ 21 is nonzero. To calculate this term we change variable from σ 21 to and make a partial integration in ϕ. The boundary term and the new ϕ integral (in which we rescale ϕ → b 2 0 ϕ) can now be expanded in b 0 . The longitudinal momentum integrals are again elementary. We thus find where γ E = 0.577... is the Euler constant. We have used . Using this relation again we find that the total probability P ≈ P 11 dir + P 22→1 dir is given by Note that, unlike the probabilities for the first-order processes nonlinear Compton and Breit-Wheeler [5,6], P does not scale as a 2 0 ; it has instead a nontrivial dependence on a 0 3 . The reason for this is that, while the incoming particle in the first step has high energy (which leads to perturbative scalings), the particles involved in the second step do not. This suggests that higher-order diagrams will in general have subprocesses associated with lower energy which give more nontrivial dependencies on the field strength. The appearance of softer (χ ∼ 1) vertices is of course also what one would expect for late vertices in cascades, after the initial momentum has been distributed among a large number of particles. However, here we see that this happens already at O(α 2 ).
Although (23) has in general a nonlinear dependence on a 0 , the first step in the trident process is simple in a way similar to the first-order probabilities in [5,6]. In fact, the first step in trident has to leading order no dependence on the field, all the field dependence comes from the second step, see Fig. 1. This generalizes a corresponding result in perturbative O(a 2 0 ) trident [48,55], where a single photon (which would come from the background field in our case) is absorbed, and where to leading order this photon is absorbed only by the pair-production step. For small a 0 we can expand (23) to O(a 2 0 ) and compare with the old literature on perturbative trident. We find where the Fourier transform of the field is given by (P 11 dir contributes a factor of −6/109 ≈ −0.06 of the nonlog term in (24).) To compare the probability (24) with the cross section in the literature, we replace the Fourier transform a(w) → e µ 2πδ(w − w 0 )/ √ 2ωV 3 and divide by the flux density (1/V 3 ) and a temporal volume factor. We then recover exactly the literature result, see [48,55].
To compare with the LCF result (11), let us consider the limit a 0 1. One should not expect the a 0 1 limit of our high-b 0 approximation (23) to reduce to the large-χ limit of LCF (11), because taking b 0 to be largest and then a 0 to be large does not commute with taking a 0 to be largest and then b 0 to be large [5,6]. We find by taking the a 0 1 limit of (22) and (20) and The total probability is thus given by Although this is different from the χ 1 limit of the LCF approximation (11), it nevertheless looks quite similar. There is, however, an important difference.

C. Muon trident and pair production by a Coulomb field
This difference is not obvious in the above expressions, but it becomes obvious if we replace the initial electron with a muon (or some other lepton with a different mass) with mass µ = 1 (we still use units where the electron mass m e = 1). The LCF approximation (11) is independent of µ, see [24]. In contrast, the generalization of (23) is obtained in the same way as in the perturbative case, i.e. one should replace b 0 → b 0 /µ. This means that in the rest frame of the initial particle, the probability is independent of µ. This suggests that (28) can be directly compared with the probability of pair production by a plane-wave field and an infinitely massive initial particle in the form of a stationary Coulomb field. That process has been calculated in a constant-crossed field in [24,56]. We find that (28) agrees perfectly with Eq. (19) in [56] or with Eq. (45) in [24] (2ImT gives the pair-production probability). Thus, our new large-b 0 approximation interpolates between the old result for perturbative trident for a 0 1 and the old result for pair production by a Coulomb field in a constant-crossed field for a 0 1. This relation with pair production by a Coulomb field can also be seen in the perturbative case O(a 2 0 ) [48,57,58], so one can expect it to hold for arbitrary a 0 .
To show that this is indeed the case, we need to calculate the probability of pair production by a Coulomb field and an inhomogeneous plane wave with arbitrary a 0 . Formally, the calculation is similar to nonlinear Breit-Wheeler pair production, except that the photon polarization vector µ should be replaced by the Fourier transform of the Coulomb field A 0 (x) = e 4πr , which is given by A 0 (l) = e/l 2 , and the Coulomb photon is off shell. The amplitude is given by where ψ and ψ − are the Volkov solutions for the electron and positron (same notation as in [1]). We work in the rest frame of the initial particle (the Coulomb center), so b 0 gives the frequency of the plane wave. The integrals over x ⊥ and x − give a delta function which we use to perform the integrals over l ⊥ and l 3 = −2l − = 2l + . The probability is given by where dp = θ(p − )dp is Lorentz invariant, and p 2 and p 3 are the momenta of the electron and positron. We exponentiate the Coulomb factor and perform the resulting Gaussian integrals over p 2⊥ and p 3⊥ . We are using the Coulomb gauge for the Coulomb field, rather than the lightfront gauge, and we find terms that are conveniently rewritten using partial integration, using e.g.
Next we can perform the u integral in terms of an incomplete gamma function, This result (33) is exact in b 0 (and a 0 ). For large b 0 it gives logarithmic terms. The large b 0 limit is obtained by rescaling s 2,3 → s 2,3 /b 0 and expanding in b 0 . We change variables from s 2 = t(1 − ν)/2 and s 3 = t(1 + ν)/2 to t and ν. After performing these (elementary) longitudinal momentum integrals we finally find (23). Thus, our highenergy approximation (23) for trident agrees exactly with the high-energy limit of pair production by a Coulomb + plane-wave field, for arbitrary a 0 , field shape and polarization. Although in the near future it will probably be easier to reach high χ by increasing a 0 rather than b 0 , i.e. within the LCF regime, the high-χ limit of LCF does not agree with the result for Coulomb + constantcrossed field, this connection is instead seen in the high-b 0 limit (23). Pair production by the combination of a Coulomb field and inhomogeneous plane waves has been studied at high energies in [59][60][61][62][63]. For a comparison between muon trident and pair production by a Coulomb field in a plane wave see [49].

D. Weizsäcker-Williams equivalent photon approximation
For other processes, in the absence of a strong laser field, a common tool for studying the high-energy limit is the Weizsäcker-Williams (WW) equivalent photon approximation [64][65][66], see e.g. [67] for a textbook treatment.
At the time of the famous experiment at SLAC [25], no complete description of trident existed, so a WW approximation was used to estimate the importance of the one-step term 4 . However, in [28] it was shown that the WW approach does not agree with the high-χ limit of the LCF approximations. In this section we will explain why this is.
In our case the starting point for a WW approximation is given by (cf. [28]) where P BW is the photon-averaged probability of nonlinear Breit-Wheeler pair production, which can be expressed as [1] where again q 1 = s 2 + s 3 is the longitudinal photon momentum. We again rescale s 2,3 → s 2,3 /b 0 and change variables from s 2 = t(1 − ν)/2 and s 3 = t(1 + ν)/2 to t and ν. To leading (logarithmic) order we find which is exactly the same as the ln b 0 part of the full approximation (23). So, the WW approach does work. It agrees with our new approximation where b 0 is the largest parameter. Since the large a 0 limit of this approximation does not commute with the large b 0 limit of the LCF approximation, we now see that the reason that WW and LCF do not agree is that for WW to work we need b 0 to be the largest parameter, while for LCF to work we need a 0 to be largest. The WW approach might be the simplest way to obtain the ln b 0 term, but b 0 would have to be very large in order for the constant terms to be negligible compared to this logarithmic term.

E. Nonlocal corrections
In Sec. III C we showed that the large-a 0 limit of our large-b 0 approximation is fundamentally different from the large-χ limit of the LCF approximation, even though they at first sight look similar. This difference becomes even clearer at the next-to-leading order (NLO). The leading order (LO) (28) is obtained in a way that is similar to the derivation of LCF, i.e. it is obtained by rescaling θ →θ/a 0 and then expanding to leading order in 1/a 0 . This is a local, derivative expansion around the point where θ = 0 or φ 3 = φ 4 . For example, the effective mass becomes M 2 ≈ 1 +ȧ 2 ⊥ (σ)θ 2 /12 with corrections involving higher derivatives of a(σ). The correction to the leading-order LCF approximation (11) and (12) is obtained by simply including higher orders in this expansion. In contrast, we will now show that the nextto-leading order correction to (28) is nonlocal. In fact, (at least for a long pulse considered in this section) its scaling with respect to a 0 is not universal, it depends on the pulse shape. Comparison between the exact, the leading order (28), and the leading order plus the next-to-leading order correction (2). The field is monochromatic with a circular polarization, f = {sin φ, cos φ}. The first plot shows the part proportional to ln 2b0 and the second plot the rest. In both cases a factor of α 2 T has been factored out, where T is a volume factor. The nonlocal correction is clearly important here.
We can see this using a long pulse with circular polarization, a(φ) = a 0 (sin φ, cos φ)h(φ/T ), where h(x) gives the envelope shape, e.g. e −x 2 or θ(1−2|x|). For T 1 we rescale σ = T u and expand in T . (The locally monochromatic approximation has recently been studied in [68].) We have and We see that u only appears in the integrand via a 0 h(u). Let us for simplicity consider first a flat-top envelope h(x) = θ(1 − 2|x|), so the u integral gives trivially 1. We obtain NLO by subtracting from the exact integrand the integrand that gives LO (which is obtained by rescaling θ →θ/a 0 and expanding to leading order in 1/a 0 ), but expressed in terms of the original θ rather thanθ = a 0 θ, and then we expand this difference directly in 1/a 0 , i.e.  (46)) instead of a flat-top envelope. Since LO∼ a0 and NLO∼ √ a0 (apart from log terms), the absolute difference increases with a0, so NLO is even more important for this pulse shape. For the constant part the NNLO scales as a 0 0 .
without rescaling any integration variables. We find where and F LCF =ḟ 2 (σ)θ 2 /12. The integrand in (40) has an integrable singularity at θ = 0, so we can set i → 0. Note that, in contrast to LO (28) and what one might have expected from the LCF regime, this NLO depends nonlocally on the field, i.e. it is not an expansion around θ = 0 and the dominant contribution to the integral comes from a θ interval with θ ∼ 1 (neither large nor small, dimensionless). So, while we for LO (28) can perform the θ integral for an arbitrary field shape, in NLO we still have a nontrivial θ integral that feels all of the field shape. Since we are in a regime where b 0 is supposed to be larger than any other parameter, one might have expected that the formation length should be large and then the nonlocality would not be surprising. But note that the dependence on a 0 and b 0 in (23) is separated into h(a 0 ) ln b 0 + g(a 0 ), where h(a 0 ) and g(a 0 ) only depend on a 0 and the pulse shape. So, whether or not we can approximate the functions g(a 0 ) and h(a 0 ) using a local θ ∼ 1/a 0 scaling is not determined by b 0 . And if a 0 is sufficiently large (but with b 0 still being large enough such that (23) is valid) the leading order does still come from a short formation length θ ∼ 1/a 0 independently of b 0 . Note also that NLO scales as a 0 0 and ln a 0 compared LO which scales as a 0 and a 0 ln a 0 , so, unless a 0 is very large, (40) provides a numerically important correction. This is illustrated in Fig. 2. For this field we find where all the numerical factors are approximate. At least for this example, Fig. 2 shows that by including NLO we have a good approximation already at a 0 > 2.
If we instead of a flat-top envelope h(x) = θ(1 − 2|x|) have a smooth envelope, then the u integrand is approximately constant, equal to (40), in the interval where a 0 h(u) 1, but the fact that the length of this interval is now a 0 dependent means that NLO has a different scaling with respect to a 0 . Consider for example with n ≥ 1. In the limit n → ∞ we recover the flat-top h(x) = θ(1 − 2|x|). We obtain the NLO by subtracting the LO integrand, as in (40), except this time we rescale u → a 1/(2n) 0ũ before we take the limit a 0 1. We find that NLO scales as a 1/(2n) 0 (with some terms having an additional ln a 0 ). This means that for a smooth envelope NLO is even more important. It is most important for the field with the slowest decay, n = 1, where the ratio between LO and NLO only scale as √ a 0 . Note that LO is obtained from φ values on the order σ ∼ a 0 0 and θ ∼ 1/a 0 , while NLO is obtained from σ ∼ a 1/(2n) 0 and θ ∼ a 0 0 . Note also that the scaling with respect to a 0 is not universal, it depends on the pulse shape, n in this case. This also highlights the fact that NLO is not simply the next term in a power-series expansion in 1/a 0 (in contrast to the LCF regime). Fig. 3 shows that for a Lorentzian pulse (n = 1) shape the NLO term is indeed more important than for the flat-top envelope. This is especially clear for the b 0independent term, for which the error at leading order is even larger than the exact result, even for a 0 = 10. For the b 0 -independent term the NNLO term is a constant, a 0 0 . This NNLO term is obtained in the following way. Let I(σ, θ) be the integrand. LO is obtained by rescaling θ →θ/a 0 and expanding to leading order in a 0 . If one tried to obtain the next order by keeping σ andθ as independent of a 0 then one would find divergent integrals, so NLO must instead be obtained by a different a 0 scaling of the integration variables. Let I LO (σ, θ) be the leading-order integrand expressed in terms of the original θ variable. NLO is now obtained from I(σ, θ) − I LO (σ, θ) by rescaling σ → √ a 0σ and expanding to leading order in a 0 . Again, one cannot obtain NNLO with the same a 0 scaling of θ and σ, because this leads to divergent integrals. Let I NLO (σ, θ) be the integrand that gives NLO, expressed in terms of the original variables. NNLO is now obtained from (I − I LO − I LO )(σ, θ) by expanding to leading order in a 0 , this time without rescaling the integration variables. So, each of these terms are obtained with a different rescaling of σ and θ. Contrast this with the LCF or the saddle-point regime, where one just have to rescale the integration variables once and then obtain the leading as well as higher orders by simply expanding the integrand to higher orders. In the saddlepoint regime one would for example change variables to θ = θ saddle + √ χθ etc. and then expand the integrand in a power series in χ times and exponential on the form e −.../χ . This leads in general to an asymptotic series, but the terms are obtained in a systematic way. Here we need to work more in order to obtain the higher orders in the large-a 0 expansion, and, moreover, the scalings of NLO and NNLO are not universal, they depend on the field shape. We also see that, unless a 0 is very large, LO can be far from the exact result, which means that we need to obtain these higher orders. Fortunately, once we have calculated the first orders we obtain a very good approximation already at a 0 2.

F. Nonlocal corrections for short pulses
In the previous section we showed that NLO can be important for a long pulse. However, the longer the pulse is, the higher the energy has to be for our high-energy approximation (23) to good. So, in this section we will study NLO for short pulses. Since unipolar fields seem to involve more complicated calculations, we focus on a field with a(−∞) = a(∞), given by for |φ| < π and a(φ) = 0 for |φ| > π. As in the previous section, LO is obtained by expanding to leading order in θ, while the corrections are nonlocal. Due to symmetry we can restrict the integration variables to φ 1 < φ 2 . We separate the integration region into parts with P 12 = {φ 1 < −π, −π < φ 2 < π}, P 22 = {−π < φ 1 < φ 2 < π}, P 23 = {−π < φ 1 < π, φ 2 > π} and P 13 = {φ 1 < −π, φ 2 > π}. The contribution from P 23 is equal to the one from P 12 , and only P 22 contribute to LO. NLO can be calculated in a similar way as in the previous section, but the fact that we have two nontrivial integrals makes the calculations more complicated. So, we simply state the result where all the constants are approximate. While a direct calculation of this is quite involved, we can confirm it more easily by making the following ansatz for the correction, where d i are constants that can be obtained either by a numerical evaluation of the exact result for P, dP/da 0 , d 2 P/d 2 a 0 and d 3 P/d 3 a 0 at one, arbitrary, large a 0 = a r ; or by evaluating P at 4 different a r . a r should be large enough so that the exact result has converged to (49), and can be chosen much larger than the a 0 range one is mainly interested in. In this case we have checked that a r ∼ 10 4 gives good results: The need for NLO is most clearly seen in the b 0independent part. At a 0 = 30 the exact result for this part is ≈ −69.41α 2 , which is in good agreement with LO+NLO≈ −69.45α 2 , while the leading order is not great, LO≈ −43.27α 2 . This is consistent with the results in the previous section for long pulses, i.e. that NLO is needed to have a good precision even for very large a 0 . Moreover, we again find that by including NLO we have a good approximation already at notvery-large a 0 ; for a 0 = 2 we find for the b 0 -independent part {exact, LO, LO+NLO} ≈ {−7.9, −11.5, −7.7}α 2 , so LO+NLO is already close to the exact result at a 0 = 2.
A short pulse with compact support is also useful in order to demonstrate that the correction is nonlocal, because the part where the two integration variables are both outside the pulse but on opposite sides, i.e. P 13 , contributes to NLO. In this example we have which is a significant part of the total NLO.
Given that NLO is nonlocal, one might wonder if perhaps the a 0 scaling depends on the way the field goes to zero. For this reason we have also considered a(φ) = a 0 (1+cos φ) 2 for |φ| < π and a(φ) = 0 for |φ| > π, which has a different decay at φ = ±π. In one part of the calculation of NLO one finds that the dominant contribution comes from a region of the φ variables that scales differently in a 0 compared to the first example. However, we still find the same form as in (48); the only difference is the numerical coefficients.

G. Perturbation theory
In the previous two sections we studied the large-a 0 expansion and showed that by including NLO we obtain approximations that are good all the way down to a 0 2. In this section we study the small-a 0 expansion. In contrast to the large-a 0 expansion, it is quite straightforward to obtain a perturbation series in a 0 . We do not need to figure out how to rescale integration variables, we just have to expand the original integrand in a power series in a 0 and then perform the integrals at each order numerically. We consider again the compact, short field in (47). We find where L(a 0 ) and C(a 0 ) can be expanded in a power series in a 2 0 with coefficients with alternating sign and decreasing in absolute value, ] is ≈ 0.76 (reached in the P 12 region), which means that the singularity closest to the origin is at a 2 0 ∼ −1.5, so the radius of convergence is a 2 0 ∼ 1.5. This is also agrees with Fig. 4, where we compare an exact evaluation of (23) with the perturbation series. Fig. 4 shows that at a 0 ∼ 1 one can still obtain better precision by including more terms, but at a 0 ∼ 1.2 the perturbative sums deviate from the exact result regardless of how many terms one adds. (The radius of convergence for other processes in different regimes has been studied in [69][70][71].) So, perturbation theory seems to be limited to small a 0 ( 1.2 in this example), which is what one might have expected. However, there is growing interest in the field of extracting information encoded in perturbation series (around the origin in this case) to study different regions of parameter space, see e.g. [72][73][74][75] and references therein. In our case we have resummed the perturbation series into Padé approximants [76][77][78][79], where B 0 = 1, (in our case) A 0 = 0, and the other coefficients are obtained by expanding into a perturbation series and matching with (52) and (53). Padé approximants are sometimes used together with Borel resummation in order to treat asymptotic perturbation series [72,73], see [74] for an application to the Euler-Heisenberg effective action and Schwinger pair production. However, in this case we have a convergent series, so we apply the Padé method directly. In Fig. 5 we compare the first few (diagonal) approximants with N = M . These Padé approximants give a good agreement with the exact result beyond the radius of convergence all the way up to a 0 ∼ 7. So, even though we cannot use a direct sum of the perturbation series at 1.5 a 0 10, the behavior of the probability in this region is encoded in the coefficients of the perturbation series around a 0 = 0. Since we showed in the previous sections that the large-a 0 expansion (with NLO included) is good all the way down to a 0 2, we now have a significant overlap where the small-and large-a 0 expansions give numerically basically the same results, as illustrated in Fig. 6. This means that we have analytical approximations for any value of a 0 .
We can extend the reach of perturbation theory further using a conformal map [72][73][74][78][79][80] which maps the singularity at a 2 0 = −a 2 s to the unit cir- cle 5 . Instead of expanding in a power series in a 2 0 , we expand in z. Using only the conformal map also allows us to go beyond the radius of convergence; for the example in Fig. 7 we find agreement up to a 0 ∼ 20 by including terms up to z 30 . We can reach further if we perform a Padé resummation of the conformal series; in Fig. 7 we find agreement up to a 0 ∼ 50 with a Padé approximant with N = M = 14 (for the constant part; the log part is much better).
So, by calculating a sufficient number of the coefficients in the perturbation series around a 0 = 0, one can obtain a good approximation even for large a 0 . We are therefore led to consider large orders. A direct numerical integration can become challenging if we need to go to very high orders, but at sufficiently high orders we can use a semi-analytical approach. At O(a 2N 0 ), the most impor-5 The first singularity is at a 2 0 ∼ −1.5. In the plots we have chosen as to be the (numerically obtained) value of the first singularity, as is usually done. This is expected to be the optimal conformal map [81]. However, in this case this only gives a slight improvement compared to simply setting e.g. as → 1. In this particular case, this can be understood by noting that, while the singularity closest to the origin is determined by P 12 , the dominant contribution at large a 0 comes from P 22 . tant part of the integrand is given by which makes the integrand sharply peaked at the point where M 2 is at maximum. By exponentiating this factor as (cf. [82]) we can use saddle-point methods to perform the integrals 6 To leading order the rest of the integrand is simply evaluated at the maximum. Let L ijN and C ijN be the contributions from region P ij to the coefficients of L and C in (51) at O(a 2N 0 ). To leading order we find 6 Strictly speaking, the maximum is not necessarily a saddle point. For example, in the contribution from P 22 the maximum occurs on the boundary of the integration region, which means that, after a suitable choice of integration variables, we have one integration variable with linear rather than quadratic fluctuation around the maximum.
From these large-order approximations we see explicitly that we have convergent series. The ratio test gives of course the same radius of convergence as we found above.

IV. HEAVY MASS AND LOW ENERGY
In the previous section we saw that, in the high-energy limit, the "lightfront-time-instantaneous" P 11 contributes to leading order and is only smaller than the dominant term by a logarithmic term ln b 0 . This is interesting because P 11 gives in general only a small contribution. In this section we study a regime where P 11 alone gives the dominant contribution. So, consider the limit where the mass of the initial particle is much heavier than the mass of the pair, µ → ∞. This is a relevant limit since already a muon is much heavier than an electron. Our calculations could also be relevant for processes involving millicharged particles, see e.g. [83,84], but here we will focus on electrons and muons. In this limit the momentum of the initial particle does not change much during the process. So, we change variables from s 2 = (1−s 1 )u to u and from s 1 = 1/(1+t) to t, rescale t → x/µ and then take the limit µ → ∞. For P 22 the lightfront-time integrals for the first step become free, i.e. the background field enters only via the second step. We can therefore perform the σ 21 and θ 21 integrals. The first term in (9) vanishes because the θ 21 contour can be closed in the upper half of the complex plane and there are no poles there. This means that the two-step part P two does not contribute in the limit where the initial particle is much heavier than the pair. For the second term the σ 21 integral is trivial and gives The θ 21 integral can now be performed with the residue theorem, and then the r integral can be expressed in terms of an incomplete Gamma function. We recover exactly (33), so the infinite mass limit of muon trident agrees with pair production by the superposition of a Coulomb field and a plane wave, as expected.
The high-energy limit therefore just reproduces the result in the previous section, so we consider now instead the low-energy limit. We have b 0 /µ = ω and we consider ω 1. In this regime we can perform all the integrals with the saddle-point method. For the momentum integrals we have a saddle point at u = 1/2 and x = 2M (σ 43 , θ 43 ). We find and Although one could perform the remaining integrals numerically, that would not give a more accurate result than performing them with the saddle-point method, because we have already performed the momentum integrals with the saddle-point method. However, we can already see that P 11 gives the dominant contribution for arbitrary field shape, as P 22 ∼ ωP 11 P 11 . Contrast this with the case where all the masses are equal, where the opposite is true [1], i.e. P 11 ∼ χP 22→1 P 22→1 . Thus, here the lightfront instantaneous term P 11 gives the dominant contribution.
Note also that the saddle points for the lightfront-time integral are determined by the function θM , while in the equal-mass case as well as other processes such as nonlinear Compton scattering, Breit-Wheeler pair production or double nonlinear Compton scattering, the saddle points are instead determined by θM 2 [40]. In [40] we found explicit saddle-point approximations for an entire class of field shapes for processes with θM 2 in the exponent. However, these fields lead to transcendental equations for the saddle points here. It is, of course, still simple to obtain the saddle points numerically, expand around the saddle point and perform the resulting Gaussian integral analytically. For a 0 1 we can still find fully analytical results. We have M 2 = 1 + a 2 0ḟ (σ)θ 2 /12 and a saddle point at θ = i √ 6/(|a 0ḟ (σ)|), and we find where E(x + ) = ωa (σ/ω). For a constant field we recover Eq. (44) in [24].
We can also perform the remaining integral with the saddle-point method. If a 0 is only moderately large we need to include higher-order corrections to the leading order (67). By expanding in 1/a 0 we obtain where E = E(0) is the field maximum, and We have included one more order in the exponential part because even a small difference from the exact F can lead to a non-negligible difference in P due to the factor of 1/E 1 in the exponent. Note that all terms are local, they come from the region where the two lightfront time variables are close, which is seen from the fact that they are expressed in terms of derivatives of the field (evaluated at the maximum). This is what one can expect in a LCF expansion in 1/a 0 , but contrast this with the high-energy limit in the previous section, where the nextto-leading order corrections are nonlocal. Note also that we do not automatically have a local expression just because we can perform the lightfront time integrals with the saddle point method (which we can do as long as E is small enough), because, although the average variable σ = (φ 2 + φ 1 )/2 is in general forced to be close to the field maximum, for a 0 ∼ 1 we have a saddle point at θ = φ 2 − φ 1 ∼ i, so in that case the imaginary part of φ 2 and φ 1 do not have to be close, i.e. the result would be nonlocal.
In order to compare this expansion with the exact result, we consider a linearly polarized monochromatic field, a(φ) = a 0 sin φ. We have a saddle point with σ = 0 and θ determined by 7 For a 0 1 we have θ = i √ 6/a 0 , for a 0 1 θ = i ln(8/a 2 0 ). For a 0 1 the exponential part becomes 7 For comparison we note that for a circularly polarized monochromatic field the saddle point is determined by sinc θ = 1 + 1 which is the expected perturbative result since 2/ω photons have to be absorbed to produce the pair in the limit where the initial particle is very heavy. For a 0 ∼ 1 we can solve the saddle-point equation numerically. (The corresponding equation in the equal-mass case can be solved explicitly in terms of an inverse trigonometric function.) The result is compared in Fig. 8 with the corresponding approximation (68). We see that by including the first couple of terms in the 1/a 0 expansion we obtain a good approximation already for a 0 1.5. It is straightforward to obtain higher orders in 1/a 2 0 , but, as we see in Fig. 8 (where we include terms up to 1/a 26 0 ), there is a limit for how low a 0 that can be reached with a direct sum of the perturbation series in 1/a 2 0 . However, by resumming this series into a Padé approximant, we can reach much lower a 0 . So, we see that resummation methods can be useful for perturbation series in both a 0 and 1/a 0 .

A. Production of a muon pair
We can also consider the process where the initial particle is much lighter than the pair, for example an electron producing a muon pair. If a 0 = E/(m e ω) 1 it could still be that the muon nonlinearity parameter a µ = E/(m µ ω) is not large. Since the muon is much heavier than the electron µ ≈ 207 one can expect an exponential suppression, so we perform the integrals with the saddle-point method. We consider for simplicity a Sauter pulse f (φ) = tanh φ. For the two-step we find where Λ = (1+a 2 µ )arccota µ −a µ . For a µ 1 this reduces to which agrees with Eq. (24) in [24]. However, the one-step scales as and is therefore exponentially larger than the two-step. For a µ 1 we can neglect the third term. The first, leading term, − 4 √ 3µ 2 χ agrees with the result in [23]. So, this is another regime where the one-step dominates over the two-step. However, for the production of a muon pair by an electron we have 4 √ 3µ 2 ≈ 3×10 5 , so χ would have to be very large for this to not be completely negligible. It could therefore be more interesting to consider the opposite process, where there is a muon in the initial state with a µ ∼ 1 (which means a 0 1). We leave this for future studies.

V. LARGE χ FROM SMALL-χ EXPANSION
In this section we will study the χ dependence of the LCF result. In particular, we will show how asymptotic (divergent) power series in χ can be resummed using Borel-Padé-conformal methods [72,73,75,[78][79][80] to obtain a good approximation up to very large χ.

A. Nonlinear Breit-Wheeler pair production
We start for simplicity with nonlinear Breit-Wheeler pair production. In LCF the probability is given by (see e.g. [1]) where and We use χ γ = a 0 kl, where l µ is the momentum of the incoming photon, to distinguish it from the electron χ in trident. s = kp /kl is the fraction of the longitudinal momentum given to the produced electron. Ai(ξ) is the Airy function and We could consider a field with a locally constant χ γ (σ), but here we focus on the integrand R at a given value of χ γ . For small and large χ γ the probability is given by [85] χ γ 1 : R = 3 16 and χ γ 1 : The goal now is to obtain sufficiently many higher-order corrections to (80) in order to make a resummation that works up to χ large enough so that we have agreement with (81). We can do this by first expanding the Airy functions at large arguments, where x = ξ 3 2 = r/χ γ , and similarly for Ai (ξ)/ξ. The s integral can now be performed by expanding the integrand around the saddle point at s = 1/2 8 . We find R = 3 16 where We have calculated the first 56 terms, but it is not difficult or time consuming to obtain more terms. By plotting the ratio of neighboring coefficients T n /T n−1 , it is clear that they grow factorially. It is therefore natural to make a Borel transform We have a finite number of terms for BT. We resum this truncated series into a Padé approximant, PBT(t), which gives a ratio of two polynomial functions of t. The final result is now obtained by a Laplace transform In Fig. 9 we compare the direct perturbation series and the resummation with the exact result. We see that, at sufficiently small χ γ , the leading order (80) gives a good approximation, but as we increase χ γ it starts to deviate. Since the power series in χ γ is a divergent asymptotic series, a direct summation of higher-order terms does not help. However, the Padé-Borel resummed series gives an excellent agreement with the exact result already at Padé order M = N = 5. In fact, N = 5 works up to large χ γ : At χ γ = 100, the exact result is ≈ 0.0749, compared to ≈ 0.0755 for the resummed series. Going to N = 25 we find that the Padé-Borel resummed series has a large overlap with the large-χ approximation (81). This is what we wanted to see; the small-χ γ expansion gives divergent power series, but by resumming this series with Borel-Padé methods we obtain a good approximation all the way up to the region where the leading large-χ γ approximation becomes good.
In fact, we can obtain an even better agreement by performing a conformal transformation of the Borel transform before forming a Padé approximant, as described in [72,73]. By numerically matching the ratio of neighboring large-order coefficients of the Borel transform onto the following asymptotic form (cf. Richardson extrapolation in [87]) where BT n = T n /n!, we find that this ratio converges to c 0 = −3/8. This means that the Borel transform has a finite radius of convergence given by |t| < 8/3, and a singularity at t = −8/3. We therefore replace t in the truncated Borel series with the conformal variable z given by The next steps are to expand the resulting function in a power series in z to the same order, make a Padé approximant of the new series, and finally express z in terms of t. This gives a Padé-conformal resummation PCBT(t) of the original truncated Borel series [72,73]. The final step is to perform the Laplace transform in (86) with The result is quite impressive: At χ γ = 10 3 the relative error (R approx /R exact ) − 1 is {0.2, 0.01, 0.05, 5 × 10 −7 , 0.02} for {PB 5 , PB 25 , PCB 5 , PCB 25 , (81)}, where the subscripts 5 and 25 stand for the order N = M in the diagonal Padé approximant. We see that by includ-ing the conformal step, the relative error at N = 5 is on the same order of magnitude as N = 25 for the case without the conformal step. With N = 25 the conformal approximation gives an extremely high precision, with a relative error several orders of magnitude smaller than the large-χ γ approximation (81). At χ γ = 10 4 we have a relative error of {2 × 10 −4 , 4 × 10 −3 } for {PCB 25 , (81)}, so at such a very large χ γ the resummation PCB 25 still gives a very high precision and a relative error that is one order of magnitude smaller than the large-χ γ approximation (81). At χ γ = 10 5 we have a relative error of {7 × 10 −3 , 8 × 10 −4 } for {PCB 25 , (81)}, so PCB 25 still gives a relative error of less than one percent. We see that, while the large-χ γ approximation eventually gives a higher precision, this only happens at a very high χ γ . In fact, this only happens as αχ 2/3 γ becomes large, and then one would not trust the leading order in the α expansion anyway. So, if we limit ourselves to αχ 2/3 γ not large, then the resummation gives a remarkable precision over the entire range of χ γ .

B. Trident
We will now use the above resummation method for trident. In comparison with previous studies using resummation methods for Schwinger pair production [74,88,89], note that our expansion parameter χ gives the field strength in the rest frame of the initial electron in terms of the critical field. While the one-step part eventually becomes larger than the two-step part as the energy increases, in this section we will assume that a 0 and the pulse length are large enough such that the dominant contribution is given by the two-step part. In LCF this is given by where (see e.g. [1,28]) where ξ 1 = (r 1 /χ(σ 21 )) 2 3 and ξ 2 = (r 2 /χ(σ 43 )) 2 3 . This expression allows for a locally constant χ(σ), but we will for simplicity consider a constant field. For χ 1 we obtain as above For χ 1 we have (12). We can again obtain, without much work, the first ∼ 70 terms in T . We again find a series with factorially growing coefficients, so we use the Borel-Padé method. The results are shown in Fig. 10. We again find that the Borel-Padé method gives excellent agreement with the exact numerical result up to large values of χ.
Given the impressive improvement found in the previous section for nonlinear Breit-Wheeler by making a conformal transformation, one would of course also like to make a similar transformation for trident. However, for trident we find that the ratios of neighboring Borel coefficients, BT n /BT n−1 , does not converge (we have calculated > 70 coefficients). Instead, we find at large n a ratio that goes periodically through 4 different values {..., −0.48, −0.23, −0.07, +0.69, −0.48, ...}. This indicates the presence of complex conjugate pair of singularities on the radius of convergence [90]. Compare this with the Breit-Wheeler case, where the only convergence-limiting singularity is on the negative real axis. So, we cannot use the standard ratio test to determine the radius of convergence. One could still use Cauchy-Hadamard's theorem, which gives the radius from 1/r = lim sup n→∞ |BT n | 1 n , but this converges slowly [90]. A better approach is to use Mercer-Robert's procedure [91]: The radius of convergence is given by For the coefficients that we have calculated, B n has some small oscillations at large n, but we find that the radius of convergence is given by |t| ∼ 3.8. The Mercer-Robert's procedure also gives the positions of the conjugate pair of convergence-limiting singularities re ±iθ , from the n → ∞ limit of We find θ = 3π/4, so the singularities closest to the origin are at ∼ 3.8e ±3iπ/4 . We can confirm this by plotting Padé approximants 9 of the Borel transform. The Padé approximants do indeed have singularities at ∼ 3.8e ±3iπ/4 . The Padé approximants also show singularities on the real axis at t < −5, i.e. further away from the origin 10 . The fact that the singularities closest to the origin do not lie on the real axis might suggest using a different type of conformal map, e.g. as in [92]. However, we have found that a conformal map on the form (88) 25 , PCB 25 , (12)}. So, even for χ = 10 4 the relative error of this conformal map is on the same order of magnitude as the large-χ approximation (12). Since one might anyway want to keep αχ 2 3 from becoming large, this conformal map seems good enough for our purpose. So, we can obtain a good approximation at large χ by resumming the small-χ expansion. It is also interesting to note that the small-χ expansion is obtained by expanding around the saddle point where the three finalstate particles have the same longitudinal momentum s 1 = s 2 = s 3 = 1/3. The expansion coefficients around this point contain the information needed for large χ, even though the spectrum is sharply peaked at s 1 1 at large χ. So, we are expanding around a point at which the value of the spectrum is negligible compared to the spectrum's maximum at large χ.

C. Hypergeometric/Meijer-G resummation
In this section we will use some resummation methods [93][94][95] which are particularly suitable for functions with a branch cut. The first step is still to calculate the truncated Borel transform (85), but then this series is resummed using hypergeometric functions q+1 F q instead of the Padé-conformal methods. Assuming that we only have the perturbative information, then the resummation is taken as [94] HBT(t) = q+1 F q a, a 1 , ..., a q b 1 , ..., b q ; where a = 1 and the N = 2q + 1 constants a i , b i and t 0 are obtained by expanding the hypergeometric function in a series in t and matching with the Borel transform truncated at t N . In practice, this is conveniently done by [94] matching the first N ratios of the Borel coefficients, BT n+1 /BT n onto a Padé approximant in the variable n q i=0 and then comparing with the series-definition of q+1 F q . The Laplace transform can then be expressed compactly in terms of a Meijer-G function [96,97], For nonlinear Breit-Wheeler, this new resummation allows us to obtain a high precision at large χ γ with much fewer terms than with the Padé-conformal approach. Already at N = 5 we find a relative error of less than 5% for χ γ = 10 3 . At N = 7 we have encountered some instabilities that seem to be related to the fact that one a i is very close to one b i , i.e. we are close to a point where the hypergeometric and Meijer-G functions are reduced to a lower order. However, at N = 9 we find a relative error of less than 3 × 10 −3 at χ γ = 10 3 , and less than 8 × 10 −3 at χ γ = 10 4 . This is already an impressive improvement, but one can obtain a high precision with even fewer terms by using other known facts in addition to the perturbative data [78,95,98]. In our case, we know that the Borel transform has a singularity at t = −8/3, which can be used to set t 0 = −8/3. We could then let a be a constant to be obtained by matching in the same way as the other a i and b i . However, we also know the asymptotic scaling at large χ γ (81), R ∼ 1/χ 1 3 γ , which, together with the asymptotic limit of the Meijer-G function, can be used to fix one of the constants, e.g. a = 1/3 (the other constants will be larger so that a = 1/3 gives the leading asymptotic scaling). 2q + 1 terms are now needed to fix the constants in q+1 F q and the overall prefactor. Already at q = 1, i.e. with only 3 terms in the perturbation series, we find a relative error of 0.01 at χ γ = 10 3 , and 0.011 at χ γ = 10 4 . At q = 2 we again encounter an instability. At q = 3 we find a relative error of 1.9 × 10 −3 at χ γ = 10 3 , and 2.5 × 10 −3 at χ γ = 10 4 . And at q = 4 we find a relative error of 1.5 × 10 −4 at χ γ = 10 3 , and 2.0 × 10 −4 at χ γ = 10 4 .
Thus, the new hypergeometric/Meijer-G resummation methods allow for a high precision up to very large χ γ with relatively few terms from the perturbation series. However, this does not mean that we can forget about the Padé-conformal methods: The hypergeometric/Meijer-G resummation is particularly suitable for functions with a branch cut, but for trident we saw above that the Borel transform has a more complicated structure, with the radius of convergence limited by a complex-conjugate pair of singularities rather than one singularity on the negative axis. So, it is not a priori clear that the hypergeometric/Meijer-G resummation would work for the trident case. We have nevertheless tried it and found that with N = 3 (using only the perturbative data) the resummation is good up to χ γ ∼ 20. However, because of the instabilities mentioned above, we have not been able to extend this by increasing N . One could try take the second line in (97) as an ansatz and fix some of the constants by matching with the large-χ scaling (12), which might work since for e.g. a 1 − a 2 = 0 (or an integer) the large-χ limit of involves log terms (cf. [99]). However, we leave this to future studies. In the next section we will instead consider another new resummation method.

D. Confluent hypergeometric resummation
In this section we will use the resummation method introduced in [100]. It is similar to the usual Borel-Padé method, but allows us to use the large-χ scaling to improve the convergence. In this approach the resummed T is given by where φ is some suitably chosen special function and the 2n constants c i and χ i are determined by matching the χ-series expansion of the two sides. This requires the first 2n coefficients of T . The following function was proposed in [100], where U is the confluent hypergeometric function. A simple way [100] to find the the constants in (98) is to first calculate the [n − 1, n] Padé approximant of where φ k = (a) k (b) k /k!, and then χ i are given by the poles of this approximant (these are simple poles in our case) and r i are the corresponding residues. a = b = 1 gives the usual Padé-Borel resummation, but a and b can be chosen such that the large-χ limit of φ(χ) behaves as the known limit of T . In the Breit-Wheeler case, we can take a = 1/3 and b = 1, for which φ can be expressed in terms of an incomplete gamma function with asymptotic scaling φ(χ) ∼ 1/χ 1/3 , just as in (81). This choice leads to a resummation that seems competitive with the Meijer-G resummation: At n = 1, i.e. with only the first two terms in T , the relative error at χ = 10 3 and 10 4 is ∼ 0.02; at n = 4 the relative error is ∼ 10 −3 at χ = 10 3 and 10 4 . Note that the relative error is only slightly larger at χ ∼ 10 4 compared to χ ∼ 10 3 because the large-χ scaling is built into the resummation function φ. An advantage with this resummation is that it seem relatively fast. Another advantage is that we can also use it for trident.
In the trident case, we take a = b = 2/3, which gives a large-χ limit φ(χ) ∼ [ln(χ)+const.]/χ 2/3 , which matches the scaling of the leading term in (12) (but not const.). In this case we find that we need larger n compared to the Breit-Wheeler case. n = 1 gives nonsensical results. At n = 2 we find a relative error of {0.051, 0.066} at χ = {10 3 , 10 4 }. At n = 10 the relative error is {9.7 × 10 −4 , 1.5 × 10 −3 } for χ = {10 3 , 10 4 }. The relative error seems to decrease quite slowly as one increases n (also in the Breit-Wheeler case). However, this resummation still requires much fewer terms than the Borel-Padé-conformal method. We leave it to future studies to determine whether a significant improvement can be obtained by choosing a different φ (maybe a superposition of two U ) in order to match both the ln(χ)/χ 2/3 and the 1/χ 2/3 part of (12).
So, we have seen that the resummation methods introduced in [100] gives a significant improvement over the standard Borel-Padé-conformal method, both in the Breit-Wheeler and the trident case. This still does not mean that one can forget about the Borel-Padéconformal method, because in some cases the large-χ (or equivalent) limit might be unknown 11 .

VI. CONCLUSIONS
We have obtained new high-energy approximations in the regime where the energy parameter b 0 is the largest parameter and where the direct part of the one-step dominates over the two-step. Our high-energy approximation interpolates between the old literature result in the perturbative limit a 0 1 and previous result for pair production by the superposition of a Coulomb field and a 11 However, even if the large-χ limit is unknown, it could still be useful to make the large-χ scaling of the basis function explicit and then vary it until the best convergence is reached, as described in [78].
constant-crossed field in the a 0 1 limit. In between, for arbitrary a 0 , we find that the high-energy approximation of trident coincides with pair production by the superposition of a Coulomb field and a general, inhomogeneous plane wave.
Our high-energy approximation is the sum of a logarithmic term, ln b 0 , and a b 0 -independent term. We find that the logarithmic term can be obtained with a Weizsäcker-Williams equivalent photon approximation. Taking a 0 large usually means that the field can be treated as locally constant. However, taking first the energy parameter b 0 to be the largest parameter (our new approximation) and then taking a 0 large does not commute with first taking a 0 to be the largest parameter (standard LCF) and then taking b 0 (or χ) large. So, the fact that our new high-energy approximation agrees with the standard Weizsäcker-Williams approximation (to leading logarithmic order) explains why the latter does not agree with previous LCF results.
Another interesting difference from the LCF regime is that, while the leading order in the large-a 0 limit of the large-b 0 approximation is local (similar to the LCF regime), the next-to-leading-order correction is nonlocal, i.e. it is given by an integral where the two lightfronttime variables are not forced to be close but can be far apart. In fact, an important contribution to this correction for compact fields comes from the region where both lightfront-time variables are outside the field but on opposite sides. This is a signal that the formation length is longer in the high-energy limit compared to the large-a 0 limit.
We have also showed that in the case where the initial particle is much heavier than the pair, the dominant contribution in the low-energy limit is given by the term in the amplitude that comes from the instantaneous part of the lightfront Hamiltonian.
We have used Borel, Padé and conformal methods to resum perturbation series in a 0 , 1/a 0 and χ. The use of Padé approximants for analytical continuation of perturbation series beyond their radius of convergence has a long history in physics. Here we have shown that our new high-energy approximation has a finite radius of convergence in a 0 , but by forming Padé approximants we can go beyond this radius of convergence and obtain a good agreement with the large-a 0 approximation in an interval of intermediate a 0 values. By making a conformal transformation before Padé resummation we obtain agreement with the large-a 0 approximation up to much larger a 0 . We have also used a Padé approximant to analytically continue a power series in 1/a 0 in the low energy (saddle-point) regime. Finally, we have considered the χ dependence of nonlinear Breit-Wheeler pair production and the two-step part of trident in the LCF regime. At small χ the probability can be expanded in a power series in χ times a "Schwinger-like" exponential e −const./χ . This power series diverges, so we use a Borel transform to obtain a convergent series. Then we use Padé and conformal methods to analytically continue the truncated Borel transform as described in [72][73][74]. This gives us resummation of the originally divergent χ series into an approximation that agrees with the exact result up to very large χ, with a significant overlap with the leading large-χ approximation. We have also showed that newer resummation methods [94,100], which are based on hypergeometric/Meijer-G or confluent hypergeometric functions, can significantly reduce the number of terms that have to be calculated in order to get a certain precision. It would be interesting to further study these sorts of resummation methods for other strong-field processes and for other fields and parameter regimes.