Accelerating the convergence of free electron laser simulations by retrieving a spatially-coherent component of microbunching

A simple method to reduce the numerical cost in free electron laser (FEL) simulations is presented, which is based on retrieving a spatially-coherent component of microbunching to suppress artifact effects that can potentially overestimate the FEL gain; this significantly reduces the number of macroparticles to reach the numerical convergence and enables the direct computation of amplified radiation without solving the wave equation. Examples of FEL simulations performed to demonstrate the proposed method show that the computation time to get a reliable result is reduced by 1-2 orders of magnitude depending on the simulation condition.


I. INTRODUCTION
Numerical simulation is an important and indispensable tool to quantitatively evaluate the performances of free electron lasers (FELs), in which a set of equations describing the process of FEL amplification are numerically solved to quantify the formation of microbunching and amplification of radiation by the microbunched electron beam.These FEL equations fall into two groups; one describes the evolution of radiation copropagating with the electron beam, while the other gives the motion of an electron traveling along the undulator and interacting with the radiation.
The former equation, which is derived from the Maxwell's wave equation, has the form where z and r ⊥ = (x, y) are longitudinal and transverse coordinates, t is time, c is the speed of light, and ω is the angular frequency of radiation.The amplification of radiation is described by the growth of the complex amplitude E, and the source term S drives the amplification process; the source term is defined by the spatial and temporal distributions of the electron beam together with its trajectory in the undulator, as will be discussed later in detail.
The latter equation gives the transverse position (x j , y j ), angle (x ′ j , y ′ j ), time t j , and energy γ j of a specific (j-th) electron.Ideally, the equations are solved for all the electrons in the electron beam, which is usually too time consuming.Thus, instead of such a straightforward method, the electron beam is usually modeled by a collection of "macroparticles," each of which represents a large number of electrons.In the simulation of self-amplified spontaneous emission (SASE) FELs, the macroparticles are initially distributed to correctly model the shotnoize in the electron beam (quiet loading) [1].
To solve the wave equation (1) numerically, the source term S is discretized in the spatiotemporal coordinates; the electron beam is temporally divided into "slices," each of which is further divided into "cells" in the transverse plane; the source term is then evaluated at the center of respective cells (grid points), by adding the contribution from each macroparticle over those contained inside the cell.Note that the length and interval of a slice are usually defined as a multiple of wavelength of radiation, so that the wave equation can be temporally averaged and modified to a convenient form to be numerically solved.
In the method to solve the wave equation using the macroparticle model explained above, we have two difficulties.First, the number of macroparticles should be sufficiently large to obtain a reliable simulation result, which potentially requires a long computation time.
Second, the number of macroparticles needed for numerical convergence is sensitive to the simulation condition; for example, it grows rapidly as the FEL gain is reduced as shown later.
To tackle the above issues, it is important to explore a scheme to accelerate the numerical convergence in FEL simulations.In this paper, we present a simple method for this purpose, which is based on retrieving a "spatially-coherent" component in the transverse microbunching profile to suppress artifact effects coming from an insufficient number of macroparticles.
We also show that this method can be applied to reduce the numerical cost and computation time; the evolution of radiation can be easily described without rigorously solving the wave equation.

II. NUMBER OF MACROPARTICLES FOR NUMERICAL CONVERGENCE
Let us first discuss the number of macroparticles to reach the numerical convergence in FEL simulations.As an example, we assume an electron beam whose temporal and spatial distribution functions are given by Gaussian functions, which drives a SASE FEL at the photon energy of 10 keV with the conditions summarized in Table 1.For reference, the number of electrons actually contained in the simulation time window of 18 fs is around 4 × 10 8 .The FEL simulations have been performed with the computer code SIMPLEX [2] and GENESIS [3]; to evaluate the numerical convergence, simulations have been repeated with different numbers of macroparticles; the results are shown in Figs.1(a) and 1(b), where the growth of the pulse energy along the undulator, or a gain curve, is plotted for each condition.
The results with the two different simulation codes are consistent with each other, and are sensitive to the number of macroparticles in terms of the gain length and saturation power.
It is reasonable to say that we need 10 7 macroparticles to reach the numerical convergence in this condition.
It should be emphasized here that the numerical convergence discussed above is not universal, but is sensitive to a variety of factors.Among them, the harmonic number is one of the most sensitive.For example, Figs. 1(c) and 1(d) show the gain curves for the 3rd harmonic radiation simulated with the same conditions as the above; obviously, we need much more macroparticles than the fundamental radiation, and 10 8 macroparticles or more are needed to reach the numerical convergence, which is close to the real electron number of 4 × 10 8 .
It is worth noting that simulations with fewer macroparticles always overestimate the FEL gain; as discussed later in detail, this is attributable to artifacts in the source term S in equation (1), which comes from an insufficient number of macroparticles.In this paper, a numerical method is presented, which accelerates the numerical convergence by correctly evaluating S with much fewer macroparticles than what is usually required.

III. ANALYTICAL FORMULATION A. Overestimation of the FEL gain by an insufficient number of macroparticles
To investigate how the insufficient number of macroparticles results in an overestimation of the FEL gain, let us turn to the source term of the wave equation.According to a previous paper [2], it has an explicit form where Z 0 is the vacuum impedance, I denotes the beam current, β ⊥j and ψ j are the transverse velocity and phase of the j-th macroparticle.The bracket and subscripts z and t denote averaging over the macroparticles contained in a time range [t − T /2, t + T /2] at the longitudinal position z, with T being a temporal window for averaging.In the code SIMPLEX, the wave equations are solved by spatial Fourier transform and the radiation field is composed of two terms as where a new independent variable s = z − vz t has been introduced instead of t, with vz denoting the longitudinal velocity of a reference electron averaged over one undulator period.
It is obvious that s refers to the longitudinal position in the electron beam measured from the reference electron.
The first term E d describes the diffraction of radiation, and is given by where ∆z is a step interval to solve the wave equation and ∆s = (1− vz /c)∆z corresponds to the slippage length of radiation while the electron travels the distance ∆z in the undulator.
The operator F s denotes the spatial Fourier transform with respect to r ⊥ , and k denotes the Fourier conjugate of r ⊥ .
The second term E g describes the amplification of radiation, and is given by with where γ 0 and β 0⊥ denote the Lorentz factor and transverse velocity of the reference electron.
Equation ( 5) means that the amplification of radiation is driven by two factors: F (z, θ) and δ(r ⊥ − r ⊥j )e −iψ j z,s .The former corresponds to the field amplitude of radiation, which is generated by a single electron while it travels from z − ∆z to z, and is emitted to the observation angle θ.The latter represents the microbunching at the transverse position r ⊥ and slice position s.
For numerical operation of the above equations, we introduce a "local bunch factor" as where ∆x and ∆y are the horizontal and vertical intervals of the grid points, M is the total number of macroparticles, and the summation is taken over macroparticles whose longitudinal coordinate s j satisfies the condition |s j − s| ≤ ∆s/2.The local bunch factor obviously specifies the strength and phase of the microbunching averaged over a single cell, and we have a relation where N(s) is the number of macroparticles satisfying the condition |s j − s| ≤ ∆s/2.Then, equation ( 5) reduces to where q is the total charge contained in the simulation time window.
For later discussions, we introduce a slice bunch factor to represent the microbunching in a target slice, and slice bunch power to evaluate the growth of the radiation power in that slice.electron beam passes through 3 undulator segments (from 2nd to 4th).Except for the spikes intrinsic to the SASE process, M does not have a large impact on the maximum value and temporal average of |B|; this suggests that M = 10 6 is sufficiently large as far as the slice bunch factor B is concerned.
To clarify the reason why M = 10 6 overestimates the FEL gain, let us investigate how the spatial profile of the bunch factor varies as the simulation conditions.For comparison between different conditions, we introduce a normalized local bunch factor b As found in Fig. 3(a), b at z = 5 m with M = 10 8 oscillates rapidly and randomly within an envelope defined by the distribution function of the electron beam, or a 2-dimensional (2D) Gaussian function with the RMS widths of σ x and σ y .This means that the bunch factor at the exit of the 1st undulator segment is dominated by the shotnoize of the electron beam.The above discussion also applies to the simulation result with M = 10 6 as found in Fig. 3(b); however, the maximum value is considerably larger than that with M = 10 8 .
Recalling that the slice bunch factor |B| is similar in each condition, it is obvious thus this difference comes from the fewer macroparticles contained in a single cell.plane.This means that the longitudinal positions (s) of the macroparticles contained in the same slice are close to each other, regardless of their transverse positions (r ⊥ ).Thus, the microbunching in this condition is spatially in phase, or its "spatial coherence" is high.it can be artificially higher than what is really reached in those with a reasonable number of macroparticles.Because of such a spiky structure of b, the slice bunch power P evaluated with M = 10 6 is about 3 times higher than that with M = 10 8 , although the slice bunch factor |B| is similar; as a result, the simulation with M = 10 6 overestimates the radiation power and FEL gain.

B. Retrieving a spatially-coherent component of microbunching to suppress artifact effects
To suppress the overestimation of the FEL gain due to an insufficient number of macroparticles, we need to retrieve the "true" spatial profile of the local bunch factor, as found in Fig. 3(c).For this purpose, let us assume that b can be decomposed into two terms, where b C denotes the spatially-coherent component and is a slowly varying function of r ⊥ , while b S denotes the incoherent one and rapidly oscillates as r ⊥ ; note that the argument z has been omitted because it is fixed in the following discussions.When integrated over r ⊥ , b C reduces to B, while b S vanishes.Obviously, the incoherent term b S has no impact on the amplification process because radiation coming from this component rapidly diverges because of diffraction, and thus can be omitted from equation ( 10) to describe the evolution of radiation.In addition, it is natural to assume meaning that the cross term b C b * S vanishes when integrating over r ⊥ .Now let us assume that b C is given by a simple 2D Gaussian function where κ is a scaling factor to define the horizontal and vertical RMS sizes of the 2D Gaussian with where L q is a Laguerre function of the q-th order.Recalling the orthonormality of Ψ q , the coefficient g q is given by where equation ( 8) has been used.Now we have Comparing equations ( 16) and ( 21), we can define a maximum order Q of the LG function comprising the coherent term, namely while the incoherent term is given by Under the assumption ( 17) and averaging over the whole slices, we have where Q should be determined to be consistent with the assumption that b C is a slowly varying function of r ⊥ ; to be more specific, Q should be reasonably low, because b C approaches to b for q → ∞.
To look for the reasonable value for Q, let us consider a simple case when b S (r ⊥ , s) = 0 and thus the coefficient g q is given by and examine how g q varies as the order q and scaling factor κ. Figure 4 shows |g q | 2 numerically evaluated for different values of q in a practical range of 0.5 ≤ κ ≤ 1, normalized by the 0-th order.Because of the orthogonality of the LG function, g q vanishes for q ≥ 1 and κ = 1, and not shown explicitly.
The numerical results show that |g q | 2 drops rapidly as q, and the contribution from the 2nd and higher orders is negligibly small; even with an extreme case when the transverse size of the local bunch factor is half of that of the electron beam (κ = 0.5, black), |g 2 | 2 is two orders of magnitude lower than the fundamental component.Then we can conclude that Q = 1 is the reasonable choice to define κ and thus we finally have LG Function Order q κ=0.5 FIG. 4. Square of the coefficient g q for the q-th order LG function to expand a simple 2D Gaussian function for different scaling factors κ.Now, the spatially-coherent component can be retrieved from equations ( 17), ( 20) and (26), using the distribution of the macroparticles.Then we have with X = κkσ x and Y = κkσ y denoting the normalized RMS widths of the local bunch factor in the horizontal and vertical directions.This formula can be used in (10) to describe the amplification of radiation.Because it does not contain any artifacts coming from an insufficient number of macroparticles, it is expected that the overestimation of the FEL gain is avoided and the numerical convergence is reached with much fewer macroparticles.
C. Direct calculation of the radiation field without solving the wave equation Besides accelerating the numerical convergence, the numerical method presented in the previous section can be applied to reduce the computation time by taking advantage of analytical formulas instead of performing a number of numerical processes.
Using equations ( 10) and ( 27), we have with where subscripts n in X and Y indicate that they should be evaluated at the longitudinal position z = z n .Equation ( 28) describes the amplification of radiation at the m-th slice and n-th step in the angular domain.
To simplify the above formula, we compare the two functions F (z n , θ) and G n (θ) in terms of the angular divergence.It is easy to understand that the typical divergences of F and G n are γ −1 0 and X −1 n (or Y −1 n ); the former is much larger than the latter as long as the electron beam size satisfies the condition σ x,y ≫ γ 0 λ/(2πκ) with λ being the wavelength of radiation, which is usually the case for an electron beam in FELs.Then it is reasonable to make an approximation and we have denoting the on-axis (θ = 0) radiation field in the angular domain, which is defined once the slice bunch factor is obtained.
The angular representation ε p,m (θ) ≡ F s [E(r ⊥ , z p , s m )] at the m-th slice and p-th step is evaluated by summing up the contributions from the 1st to p-th steps.Recalling the slippage and diffraction effects, we have with ẑpn = k(z p −z n ) denoting the normalized distance.Because of its manageable form, the above formula significantly reduces the numerical cost to describe the amplified radiation.
For example, the instantaneous radiation power P p,m given by contains a spatial integral that can be analytical performed as follows meaning that the instantaneous radiation power is evaluated by summing up a number of terms instead of the 2-dimensional integral.Another example is the radiation field needed to compute the energy variation of each macroparticle interacting with radiation; inverse Fourier transform of (33) yields the radiation field E(r ⊥ , z p , s m ) at the m-th slice and p-th step, which can be analytically performed to give meaning that we do not have to solve the wave equations ( 4) and ( 5).

D. Extension to high harmonics
Although the discussions so far are focused on the fundamental radiation (λ = λ 1 , where λ 1 is the fundamental wavelength of radiation), they can be applied to high-harmonic radiation by modifications λ → λ 1 /h and ψ → hψ, with h being a harmonic number.It should be noted, however, that the approximation (30) is not valid for even-harmonic radiation that is canceled out in the forward direction θ = 0, and we have F (z, 0) = 0 and ε p,m (θ) = 0, meaning that the amplification of even harmonics cannot be described correctly.
To overcome the above difficulty, we make an approximation for even harmonics, to yield and a similar expression for e y;n,m .The above formula makes it possible to derive analytical formulas for the radiation field and power, E and P p,m .For example, P p,m contains a spatial integral that can be analytically performed as follows the following discussions, the simulation results without and with the proposed method are refereed to as (A) and (B), respectively.Figure 5 summarizes the gain curves for the 1st, 2nd and 3rd harmonics, with the alphabets and numbers indicating the simulation method and harmonic order.For example, (A-1) shows the result for the fundamental radiation without the proposed numerical method.
FIG. 1. FEL Gain curves simulated with (a,c) SIMPLEX and (b,d) GENESIS, using different numbers of macroparticles for the (a,b) fundamental and (c,d) 3rd harmonic radiation.
Now let us turn to how the microbunching evolves during amplification, and how it depends on the number of macroparticles.As an example, we consider bunch factors evaluated with M = 10 8 and M = 10 6 in the simulations presented in section II, at two different longitudinal positions of z = 5 m (exit of the 1st segment) and z = 23.45m (4th segment).To facilitate the following discussions, let (a) specify the condition of z = 5 m and M = 10 8 , (b) z = 5 m and M = 10 6 , (c) z = 23.45m and M = 10 8 , and (d) z = 23.45m and M = 10 6 .

Figure 2
Figure2shows the temporal profiles of the slice bunch factor |B| in the four different conditions (a)-(d); we find that |B| is enhanced by nearly two orders of magnitude after the Figures 3(a)-3(d) show the spatial profiles of the normalized local bunch factor b for the four different conditions (a)-(d), respectively.Note that b is averaged over 5 slices around the target slice position defined above, which corresponds to the coherence length [4] (8 nm) in the condition under discussion; this is to improve the visualization without modifying the characteristics of b.The ellipse in the contour plot indicates the area defined by ±3σ x and ±3σ y , where σ x and σ y are the root mean square (RMS) electron beam sizes in the horizontal and vertical directions, with σ x = 0.014 (0.022) mm and σ y = 0.022 (0.014) at z = 5 (23.45) m.

Figure 3
Figure 3(c) is the same as 3(a), but at z = 23.45m; through the amplification process, the maximum value of |b| is enhanced by a factor of ∼30 compared with that at z = 5 m.What is more important is that the spiky structure in the early stage of amplification found in Fig. 3(a) almost disappears, and roughly speaking, b is overall positive in the whole transverse

Figure 3
Figure 3(d) is the same as 3(b), but with M = 10 6 .Although it reproduces the result with M = 10 8 in terms of the constant phase over the transverse plane and enhancement factor of ∼30, the spiky structure still remains.This obviously comes from an insufficient number of macroparticles; b vanishes in cells where no macroparticles are contained, while function.The validity of this assumption is evident from Figs. 3(c); except for the spikes, b is obviously represented by a 2D Gaussian function, whose aspect ratio is the same as that of the spatial distribution function of the electron beam.To evaluate the scaling factor κ, we expand b by Laguerre-Gaussian (LG) functions as follows b ;n,m−p+n θ x + e y;n,m−p+n θ y )G n (θ) exp − iẑ pn θ 2

TABLE I .
Parameters assumed in the simulation examples.