Minimizing longitudinal distortion in a nearly isochronous linear nonscaling fixed-field alternating gradient accelerator

Linear nonscaling FFAGs (fixed-field alternating gradient accelerators) are machines that use linear magnets to achieve an extremely large energy acceptance (generally a factor of 2 or more). This paper examines the longitudinal dynamics in such a machine, focusing on the longitudinal acceptance, the phase space area that is transmitted without excessive distortion. The paper shows how to compute the distortion in two ways: computing the emittance growth, and computing the distortion of an initial ellipse from an elliptical shape. The paper will describe a model for the longitudinal dynamics in a linear nonscaling FFAG, show how to compute the longitudinal distortion in such a machine using a Dragt-Finn factorization, examine the accuracy of the calculation, and describe how longitudinal acceptance can interact with other performance criteria for an FFAG.


I. INTRODUCTION
Fixed-field alternating gradient (FFAG) accelerators are machines which accelerate over a large range of energy (generally a factor of 2 or more) without varying the magnet fields. Since the magnet fields do not vary during acceleration, acceleration is potentially very rapid. FFAGs were first studied and built in the 1950s [1-3], but little further work was done on these machines until very recently. In the last few years, two FFAGs have been built in Japan [4-8], and FFAGs have been proposed there for several other projects.
All of the FFAGs built until now have been what are today called "scaling" FFAGs. They are called scaling because of an important property of the machine: after applying a particular energy-dependent linear transformation to the phase space variables, the dynamics of the machine become completely independent of energy. In particular, the tunes and momentum compaction are independent of energy, and the energy-dependent closed orbits are geometrically similar. The energy-independent tune and the energy independence of the phase space (except for an energy-dependent linear transformation) allow one to choose a good working tune, as in a synchrotron, and one will then avoid resonances over the entire acceleration process.
There are of course difficulties with scaling FFAGs. The time of flight depends strongly on energy, even when the velocity is nearly the speed of light, so one must either vary the rf frequency to match the time of flight, which limits the acceleration rate, or one must fix the rf frequency and accelerate so quickly that the bunch does not leave the rf crest. The large energy range and the machine's nonzero dispersion require large apertures. Finally, the highly non-linear magnets required for a scaling FFAG can potentially limit the dynamic apertures for those machines.
A new type of FFAG, the linear nonscaling FFAG [9,10], was proposed to improve the performance of FFAGs with respect to these difficulties, particularly for muon acceleration. Muons must be accelerated very rapidly to avoid decays, yet one desires to make as many passes through the rf as possible to use the very expensive rf cavities more efficiently. The rate of acceleration prevents the rf frequency from being varied, and thus the range in time of flight must be minimized. Linear nonscaling FFAGs for muon acceleration applications make the machine as isochronous as possible by making the machine isochronous somewhere close to the middle of the energy range of the machine (as in Fig. 1). In a scaling FFAG, the momentum compaction is a nonzero constant, leading to a larger time-of-flight range than for a comparable nonscaling FFAG. Muon accelerators also require an exceptionally large dynamic aperture, so to improve that, the linear nonscaling FFAGs use only linear magnets, as opposed to the highly nonlinear magnets required for a scaling FFAG. Finally, the nonscaling nature of the FFAG means that orbits will not be geometrically similar as in the scaling FFAG. This extra freedom can be used to make the apertures smaller in some of the magnets.
As can be seen in Fig. 1, the time of flight in a linear nonscaling FFAG will be well approximated by a parabola. As can be determined from the shift of the minimum from the center in that figure, cubic order terms only contribute around 10% to the time of flight within the energy range. The characteristics of the motion in phase space are determined qualitatively by this parabolic shape, and the underlying causes of the phase space distortion examined here are present when the time of flight is approximated by a parabola.
Instead of using linear magnets, one can use nonlinear magnets and make a nonscaling FFAG lattice almost perfectly isochronous [11,12]. Higher-order effects (such as the time-of-flight dependence on transverse amplitude) and errors will inevitably make the isochronicity imperfect, but one can certainly reduce the time-of-flight range substantially from what can be achieved in a linear nonscaling FFAG. While this may come at the cost of dynamic aperture (due to nonlinearities) and magnet cost, such a machine requires consideration. This paper will more directly address the case of linear nonscaling FFAGs, where the time of flight as a function of energy is nearly parabolic; the results from this paper can be useful in comparing the machine performance of linear nonscaling FFAGs to these other types of nonscaling FFAGs. This paper describes the longitudinal dynamics in a machine where the time of flight is exactly a parabolic function of energy, with the minimum of the parabola at the center of the energy range, and the rf voltage is a sinusoidal function of the phase. In the model, the rf voltage and the time of flight advance are both distributed uniformly around the ring. This system has been examined previously [13,14]. A primary constraint on the design parameters for these FFAGs will be the amount of longitudinal phase distortion one can tolerate. This paper will quantify this distortion and describe a method for computing it for this system.

II. HAMILTONIAN IN SCALED VARIABLES
The FFAG ring is assumed to consist entirely of identical cells, all of which contain an rf cavity. I take the further step of ignoring the longitudinal variation within the cell, making a continuous approximation of the system. The time of flight is approximated here to be a parabolic function of energy $E$:
$$\frac{d\tau}{ds} = \Delta T\left(\frac{2E - E_i - E_f}{\Delta E}\right)^2 - T_0,$$
where the machine is designed to accelerate from $E_i$ to $E_f$, $\Delta E = E_f - E_i$, $\Delta T$ is the difference between the time of flight per unit length at $E_i$ (or $E_f$) and the central energy, $s$ is the distance along a reference curve which defines the coordinate system, and $T_0$ is chosen so that $d\tau/ds$ is zero when the phase of a particle in the rf cavities does not change from one cell to the next. The energy gain in the cavities is sinusoidal:
$$\frac{dE}{ds} = V\cos(\omega\tau),$$
where $V$ is the average energy gain per unit length for on-crest acceleration in a cell, and $\omega$ is the angular rf frequency. The choice of placing the minimum of the parabola at $(E_i + E_f)/2$, and the fact that there is no phase in Eq.
As an example of what these parameters might be like for a real machine, consider a muon FFAG design: the machine would accelerate from 10 to 20 GeV, using 201.25 MHz superconducting rf. One design [15] has 91 combined-function doublet cells, each 4.7 m long. With a gradient of 10 MV/m in the cavities, $V$ is about 1.5 MeV/m. $\Delta T$ is about 1.4 ps/m in this lattice.
One can perform the change of variables
$$x = \omega\tau, \qquad p = \frac{E - E_i}{\Delta E}, \qquad u = \omega\,\Delta T\,s$$
to get the new equations of motion
$$\frac{dx}{du} = (2p - 1)^2 - b, \qquad \frac{dp}{du} = a\cos x,$$
where
$$a = \frac{V}{\omega\,\Delta T\,\Delta E}, \qquad b = \frac{T_0}{\Delta T}.$$
Note that $b$ is a relatively simple quantity to adjust in a machine design: a small change of frequency, phase relationship between cavities, or cell length can generally make $b$ any desired value. $a$ is generally more costly to vary: adjusting it requires that the amount of rf voltage be changed, that the number of cells be changed (to change $\Delta T$ [16]), or that the cell length (to change $\Delta T$ [16]), rf frequency, or energy range be changed significantly. The muon accelerator parameters described above came from a lattice designed for $a = 1/12$.
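As a quick consistency check on the example parameters, the scaled rf strength can be evaluated numerically. The sketch below assumes the scaling $a = V/(\omega\,\Delta T\,\Delta E)$, which is the combination implied by the scaled equations of motion; the function name `scaled_a` and the variable `a_example` are illustrative only.

```python
import math

def scaled_a(V, omega, delta_T, delta_E):
    """Scaled rf strength a = V / (omega * delta_T * delta_E) (assumed scaling).

    V        energy gain per unit length on crest [eV/m]
    omega    angular rf frequency [rad/s]
    delta_T  time-of-flight range per unit length [s/m]
    delta_E  energy range E_f - E_i [eV]
    """
    return V / (omega * delta_T * delta_E)

# Example muon FFAG quoted in the text: 10 to 20 GeV, 201.25 MHz rf,
# V of about 1.5 MeV/m and a time-of-flight range of about 1.4 ps/m.
a_example = scaled_a(V=1.5e6,
                     omega=2.0 * math.pi * 201.25e6,
                     delta_T=1.4e-12,
                     delta_E=10.0e9)
```

With these rounded inputs the result lands close to the design value of 1/12 quoted for this lattice.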
In these scaled variables, the goal of the machine is to accelerate from $p = 0$ to $p = 1$. This scaled system is governed by the Hamiltonian
$$H = \tfrac{1}{6}(2p - 1)^3 - \tfrac{b}{2}(2p - 1) - a\sin x.$$
Henceforth, I will work with this scaled Hamiltonian. When $b > 0$, this Hamiltonian has unstable fixed points at $x = \pm\pi/2$ and $p = (1 \mp \sqrt{b})/2$. The value of the Hamiltonian on the corresponding separatrices is
$$H = \pm\left(\tfrac{1}{3}b^{3/2} - a\right).$$
The separatrices delineate regions of phase space which cannot be crossed. Figure 2 shows the separatrices dividing the phase space. From the Hamiltonian values on the separatrices and an examination of the phase space it can be seen that there will be no region of phase space connecting $p = 0$ to $p = 1$ unless $a > b^{3/2}/3$. When $b < 0$, there are no separatrices. Thus, to accelerate from $p = 0$ to $p = 1$ there is no restriction from the separatrices if $b < 0$. Furthermore, there must be at least one trajectory crossing both $p = 0$ and $p = 1$. By symmetry, this means that the trajectory passing through $x = 0$ and $p = 1/2$ must pass through $p = 0$ and $p = 1$. It must therefore be possible for the Hamiltonian to be zero when $p = 0$ and $p = 1$.
The combination of these restrictions leads to
$$\tfrac{1}{3} - 2a \le b < (3a)^{2/3}.$$
Note that as a result, there is an absolute restriction that $a > 1/24$. The region of the $a$-$b$ parameter space that permits acceleration from $p = 0$ to $p = 1$ is shown in Fig. 3.
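The restrictions on $a$ and $b$ can be checked by direct integration of the scaled equations of motion. The following is a minimal sketch, not the method used for the results in this paper. It assumes the Hamiltonian $H = (2p-1)^3/6 - b(2p-1)/2 - a\sin x$ (one form consistent with $dx/du = (2p-1)^2 - b$, $dp/du = a\cos x$, with the constant chosen so that $H = 0$ on the central trajectory) and the bounds $1/3 - 2a \le b < (3a)^{2/3}$. All function names, the step size, and the step limit are illustrative choices.

```python
import math

def hamiltonian(x, p, a, b):
    """Assumed scaled Hamiltonian, with the constant chosen so H(0, 1/2) = 0."""
    q = 2.0 * p - 1.0
    return q**3 / 6.0 - 0.5 * b * q - a * math.sin(x)

def rhs(x, p, a, b):
    """Scaled equations of motion: returns (dx/du, dp/du)."""
    return (2.0 * p - 1.0)**2 - b, a * math.cos(x)

def rk4_step(x, p, du, a, b):
    """One classical fourth-order Runge-Kutta step in u."""
    k1x, k1p = rhs(x, p, a, b)
    k2x, k2p = rhs(x + 0.5 * du * k1x, p + 0.5 * du * k1p, a, b)
    k3x, k3p = rhs(x + 0.5 * du * k2x, p + 0.5 * du * k2p, a, b)
    k4x, k4p = rhs(x + du * k3x, p + du * k3p, a, b)
    return (x + du * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            p + du * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0)

def track_central(a, b, du=1e-3, max_steps=100000):
    """Integrate from (x, p) = (0, 1/2); return (x, p, u) once p >= 1, else None."""
    x, p, u = 0.0, 0.5, 0.0
    for _ in range(max_steps):
        x, p = rk4_step(x, p, du, a, b)
        u += du
        if p >= 1.0:
            return x, p, u
    return None

def b_limits(a):
    """Assumed allowed range 1/3 - 2a <= b < (3a)**(2/3); None if it is empty."""
    b_min = 1.0 / 3.0 - 2.0 * a
    b_max = (3.0 * a)**(2.0 / 3.0)
    return (b_min, b_max) if b_min < b_max else None
```

For example, `track_central(1/12, 0.2)` reaches $p = 1$, while a value of $b$ above the upper bound, such as `track_central(1/12, 0.45)`, stays trapped below it and returns `None`. Conservation of `hamiltonian` along the tracked trajectory gives a simple check on the step size.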

III. LIE ALGEBRAIC FORMULATION
To analyze the phase space transmission in this system, I will write the map for evolution in $u$ in Lie algebraic form, using the Dragt-Finn factorization [17,18]. In this paper, I will analyze the map about the central trajectory, passing through $x = 0$ and $p = 1/2$. It is convenient to start with the map from $x = 0$, $p = 1/2$ forward in $u$. Its Lie factorization can be written as
$$\mathcal{M}_{\mathrm{half}} = e^{-:g_1:}\,e^{:f_5:}\,e^{:f_4:}\,e^{:f_3:}\,e^{:f_2:}\,e^{:f_1:}. \qquad (11)$$
Here $e^{-:g_1:}$, with $g_1 = x/2$, is the operator that translates to the initial conditions $(x, p) = (0, 1/2)$. The remaining operators will be computed through numerical integration as described in [18]. The map from beginning to end is just
$$\mathcal{M} = \mathcal{C}\,\mathcal{M}_{\mathrm{half}}^{-1}\,\mathcal{C}\,\mathcal{M}_{\mathrm{half}},$$
where
$$\mathcal{C}\begin{pmatrix} x \\ p \end{pmatrix} = \begin{pmatrix} -x \\ 1 - p \end{pmatrix}. \qquad (13)$$
Note that the operator order follows the Lie algebraic convention of first to last being left to right. $\mathcal{C}$ can be written in Lie algebraic notation as the product of a first-order Lie exponential and the reflection operator $\mathcal{F}$, which takes $(x, p)$ to $(-x, -p)$. Note that $\mathcal{F}^2$ is the identity operator, as is $\mathcal{C}^2$. The combined map can thus be reduced to a first-order (translation) factor together with third- and fifth-order factors. In this reduction I use the transformation rule for similarity transformations on Lie operators, the fact that $\mathcal{F}$ leaves even order homogeneous polynomials invariant and changes the sign of odd order homogeneous polynomials, the fact that first-order Lie exponentials can be combined by adding their exponents, and $g_1 = x/2$. The final approximation is appropriate since the factorization was truncated at fifth order already, and including the effects of $f_3$ and $f_4$ on $f_5$ would result in terms higher order than fifth. Note that the linear part of the map is the identity and that any $f_4$ terms disappear from the final map. Note that if the time of flight were not perfectly parabolic or the rf waveform were asymmetric, the linear part would not be the identity (which will not matter for these results) and there would be a nonzero $f_4$ term.

A. Quantities to analyze
To give optimal performance of a machine, we will want to minimize some quantity which is related to the deviation of the machine from linearity. Different types of machines will require the minimization of different quantities. For colliders, the rms energy spread and bunch length at collision are the important quantities. Thus, the machine should minimize the emittance growth for that case. For a neutrino factory, however, the machine is generally designed to transmit a certain phase space volume and not much more. In that case, one wishes to minimize the growth of the boundary of an elliptical phase space volume.
One wants to find the effect of the FFAG on an elliptical distribution. A translation has no effect on the distribution shape, so one only needs to analyze the effect of the nonlinear part of the map, taken together with a linear transformation $\mathcal{A}$ relating the ellipse of interest to a circle; I denote the result by $\mathcal{M}_C$. In most cases, I will try to find the optimal ellipse orientation (i.e., I will try to choose $\mathcal{A}$) so as to minimize the quantity of interest. I will need to compute the effect of $\mathcal{M}_C$ on a phase space vector to third order. Writing $z_1 = (x, p)^T$ and $g_3 = g_{30}x^3 + g_{31}x^2p + g_{32}xp^2 + g_{33}p^3$, the second-order part is
$$z_2 = [g_3, z_1] = \begin{pmatrix} -g_{31}x^2 - 2g_{32}xp - 3g_{33}p^2 \\ 3g_{30}x^2 + 2g_{31}xp + g_{32}p^2 \end{pmatrix}, \qquad (21)$$
and the third-order part is
$$z_3 = \tfrac{1}{2}[g_3, [g_3, z_1]] = \begin{pmatrix} (g_{31}^2 - 3g_{30}g_{32})x^3 + (g_{31}g_{32} - 9g_{30}g_{33})x^2p + (g_{32}^2 - 3g_{31}g_{33})xp^2 \\ (g_{31}^2 - 3g_{30}g_{32})x^2p + (g_{31}g_{32} - 9g_{30}g_{33})xp^2 + (g_{32}^2 - 3g_{31}g_{33})p^3 \end{pmatrix}.$$
There are also fourth order terms, but they are not needed for the initial computations, so I will delay considering them. Because of the presence of the transformation $\mathcal{A}$, I will examine transformations of circles and circular distributions under $\mathcal{M}_C$. To optimize the design, I will vary the transformation $\mathcal{A}$ (which has two free parameters) to minimize the quantity of interest for a given $a$ and $b$. I will then find the $b$ which minimizes the quantity for a given $a$ (since, as noted above, it costs very little to change $b$), producing a plot of the quantity versus $a$.
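The second- and third-order terms above can be verified mechanically. The sketch below is an independent check, not part of the original calculation. It represents polynomials in $x$ and $p$ as dictionaries and assumes the Poisson bracket convention $[f, g] = \partial_x f\,\partial_p g - \partial_p f\,\partial_x g$, chosen because it reproduces the signs in Eq. (21). The helper names are hypothetical.

```python
def poisson(f, g):
    """Poisson bracket [f, g] = f_x g_p - f_p g_x.

    Polynomials are dicts {(i, j): coeff} standing for coeff * x**i * p**j.
    """
    out = {}
    for (i, j), cf in f.items():
        for (k, l), cg in g.items():
            c = cf * cg * (i * l - j * k)
            if c != 0.0:
                key = (i + k - 1, j + l - 1)
                out[key] = out.get(key, 0.0) + c
    return out

def scale(f, s):
    """Multiply every coefficient of the polynomial f by s."""
    return {key: s * c for key, c in f.items()}

def low_order_terms(g30, g31, g32, g33):
    """Return (z2x, z2p, z3x, z3p) for g3 = g30 x^3 + g31 x^2 p + g32 x p^2 + g33 p^3."""
    g3 = {(3, 0): g30, (2, 1): g31, (1, 2): g32, (0, 3): g33}
    x = {(1, 0): 1.0}
    p = {(0, 1): 1.0}
    z2x, z2p = poisson(g3, x), poisson(g3, p)          # z2 = [g3, z1]
    z3x = scale(poisson(g3, z2x), 0.5)                 # z3 = (1/2) [g3, [g3, z1]]
    z3p = scale(poisson(g3, z2p), 0.5)
    return z2x, z2p, z3x, z3p
```

Evaluating `low_order_terms` for arbitrary numerical coefficients and comparing with the closed forms gives a direct test of each coefficient, including the vanishing $p^3$ term in the $x$ component of $z_3$ and the vanishing $x^3$ term in its $p$ component.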

Emittance growth
To compute the emittance, we assign
$$x = \sqrt{2J}\cos\theta, \qquad p = -\sqrt{2J}\sin\theta \qquad (24)$$
and average over an arbitrary distribution in $J$ and a uniform distribution in $\theta$. The emittance is defined to be the square root of the determinant of the covariance matrix $\langle zz^T\rangle - \langle z\rangle\langle z\rangle^T$ ($\langle f\rangle$ is the average of the quantity $f$ over the distribution, and $z$ is the phase space vector). For $z = z_1$, the emittance is $\langle J\rangle$.
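The fourth order moments needed are $\langle x^4\rangle = \langle p^4\rangle = \tfrac{3}{2}\langle J^2\rangle$ and $\langle x^2p^2\rangle = \tfrac{1}{2}\langle J^2\rangle$, with $\langle x^3p\rangle = \langle xp^3\rangle = 0$. Thus, the emittance growth is
$$\tfrac{3}{4}\langle J^2\rangle\left(9g_{30}^2 + 5g_{31}^2 + 5g_{32}^2 + 9g_{33}^2 - 6g_{30}g_{32} - 6g_{31}g_{33}\right) - \tfrac{1}{2}\langle J\rangle^2\left[(g_{31} + 3g_{33})^2 + (g_{32} + 3g_{30})^2\right].$$
For the analysis, I will restrict the discussion to $\langle J^2\rangle > (4/3)\langle J\rangle^2$. The minimum emittance growth as a function of $a$ is shown in Fig. 4. For $a$ close to its minimum value of $1/24$, the emittance growth is proportional to $(a - 1/24)^{-2}$. Larger values of $\langle J^2\rangle/\langle J\rangle^2$ give larger values for the emittance growth. For a given value of $\langle J^2\rangle/\langle J\rangle^2$, the emittance growth is proportional to $\langle J\rangle^2$.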
The optimal b is shown in Fig. 5. The reason for the sudden change near 0.41 is shown in Fig. 6. For small b, the minimum emittance growth occurs at the smallest allowed b. The optimal b is independent of the ratio hJ 2 i=hJi 2 .
If the time of flight were not perfectly parabolic (as in the real case shown in Fig. 1), there would potentially be relative corrections of the order of the relative size of the difference from a parabola, due to the distortion of the orbit about which the expansion is performed and changes to the derivatives about that. The addition of f 4 terms to the map will give no contribution to the emittance growth. Specific quantitative results will be left to a subsequent paper.
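The emittance definition used in this subsection (the square root of the determinant of the covariance matrix) translates directly into a few lines of code for a sampled distribution. The following is a minimal sketch; the function names are illustrative, and the sign convention $p = -\sqrt{2J}\sin\theta$ in `circle_points` is an assumption that does not affect the emittance.

```python
import math

def emittance(points):
    """Square root of the determinant of the covariance matrix of (x, p) samples."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    mp = sum(p for _, p in points) / n
    sxx = sum((x - mx)**2 for x, _ in points) / n
    spp = sum((p - mp)**2 for _, p in points) / n
    sxp = sum((x - mx) * (p - mp) for x, p in points) / n
    return math.sqrt(sxx * spp - sxp**2)

def circle_points(J, n):
    """n points of action J, equally spaced in angle (assumed sign convention for p)."""
    r = math.sqrt(2.0 * J)
    return [(r * math.cos(2.0 * math.pi * k / n),
             -r * math.sin(2.0 * math.pi * k / n)) for k in range(n)]

def emittance_growth(before, after):
    """Change in emittance between an initial and a final set of samples."""
    return emittance(after) - emittance(before)
```

For points spread uniformly in angle, `emittance` returns the mean action, in agreement with the statement that the emittance is $\langle J\rangle$ for $z = z_1$. A linear symplectic map, such as a shear, leaves it unchanged, so any growth reported by `emittance_growth` comes from the nonlinear terms.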

Boundary distortion
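If instead we wish to minimize the distortion of a circle due to $\mathcal{M}_C$, first compute the change in the squared radius of the circle, $x^2 + p^2 = 2J$, under $\mathcal{M}_C$ to lowest order, which is $2z_1^Tz_2$. Replacing $x$ and $p$ by $J$ and $\theta$ using Eq. (24), this becomes
$$-4\sqrt{2J^3}\left[g_{31}\cos^3\theta + (3g_{30} - 2g_{32})\cos^2\theta\sin\theta + (3g_{33} - 2g_{31})\cos\theta\sin^2\theta + g_{32}\sin^3\theta\right]. \qquad (29)$$
To find the extrema of this, take the derivative with respect to $\theta$:
$$-4\sqrt{2J^3}\left[(3g_{30} - 2g_{32})\cos^3\theta + (6g_{33} - 7g_{31})\cos^2\theta\sin\theta + (7g_{32} - 6g_{30})\cos\theta\sin^2\theta + (2g_{31} - 3g_{33})\sin^3\theta\right].$$
To find the zeros of this, divide by $\cos^3\theta$ or $\sin^3\theta$, and you will get a third order polynomial in $\tan\theta$ or $\cot\theta$ respectively. The zeros of this can be found analytically, and the resulting zeros used to find the extrema of Eq. (29). Note that the distortion of the radius of the circle is proportional to the square of the radius of the circle. Figure 7 shows the minimum ellipse distortion as a function of $a$. The value of $\Delta J/(2J)^{3/2}$ is proportional to $(a - 1/24)^{-1}$ for $a$ close to $1/24$.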
The b which gives this minimum distortion is identical to the b which gives the minimum emittance growth that was shown in Fig. 5.
The comments in the subsection on emittance growth regarding a time of flight which is not perfectly parabolic also apply to the boundary distortion calculation.
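The extremum search for the boundary distortion can also be done numerically. The sketch below is a stand-in for the analytic solution of the cubic in $\tan\theta$: it brackets sign changes of the derivative on a grid in $\theta$ and refines them by bisection. It assumes the angular factor $g_{31}\cos^3\theta + (3g_{30} - 2g_{32})\cos^2\theta\sin\theta + (3g_{33} - 2g_{31})\cos\theta\sin^2\theta + g_{32}\sin^3\theta$ for the lowest-order change in the squared radius; the function names and the grid size are illustrative.

```python
import math

def shape(theta, g):
    """Angular factor of the lowest-order radius change for g = (g30, g31, g32, g33)."""
    g30, g31, g32, g33 = g
    c, s = math.cos(theta), math.sin(theta)
    return (g31 * c**3 + (3 * g30 - 2 * g32) * c * c * s
            + (3 * g33 - 2 * g31) * c * s * s + g32 * s**3)

def dshape(theta, g):
    """Derivative of shape() with respect to theta."""
    g30, g31, g32, g33 = g
    c, s = math.cos(theta), math.sin(theta)
    return ((3 * g30 - 2 * g32) * c**3 + (6 * g33 - 7 * g31) * c * c * s
            + (7 * g32 - 6 * g30) * c * s * s + (2 * g31 - 3 * g33) * s**3)

def max_abs_shape(g, n=3600):
    """Largest |shape| over theta, from bracketed zeros of dshape on [0, pi].

    |shape| has period pi because shape(theta + pi) = -shape(theta).
    """
    best = 0.0
    for k in range(n):
        lo, hi = math.pi * k / n, math.pi * (k + 1) / n
        f_lo, f_hi = dshape(lo, g), dshape(hi, g)
        if f_lo == 0.0:
            best = max(best, abs(shape(lo, g)))
        elif f_lo * f_hi < 0.0:
            for _ in range(60):                      # bisection on the bracket
                mid = 0.5 * (lo + hi)
                f_mid = dshape(mid, g)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            best = max(best, abs(shape(0.5 * (lo + hi), g)))
    return best
```

Multiplying the result by $4\sqrt{2J^3}$ gives the largest lowest-order change in $x^2 + p^2$ on the circle of action $J$. This is the quantity one would then minimize over the ellipse orientation and over $b$.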

Boundary distortion with linear transform
If one is interested only in the distortion of the outer boundary of the ellipse, one need not require that an ellipse with nonzero size have the same center, orientation, and aspect ratio as an infinitesimally small ellipse. We thus want to look at the quantity
$$[z - z_0(J)]^T B(J)\,[z - z_0(J)] - 2J, \qquad (31)$$
and minimize it to lowest order in $J$, while simultaneously finding $z_0(J)$ and $B(J)$, varying both the transformation $\mathcal{A}$ as well as $z_0(J)$ and $B(J)$. Note that $z_0(0) = 0$ and $B(0) = I$, the identity matrix. Substituting $z = z_1 + z_2$, $z_0(J) = z_{02}J$, and keeping terms to third order, Eq. (31) becomes
$$(2J)^{3/2}\left[\tfrac{3}{2}(g_{33} - g_{31})\cos 3\theta - \tfrac{3}{2}(g_{30} - g_{32})\sin 3\theta - \tfrac{1}{2}(3g_{33} + g_{31})\cos\theta - \tfrac{1}{2}(3g_{30} + g_{32})\sin\theta\right] - 2J\,z_1^Tz_{02}.$$
The $\cos\theta$ and $\sin\theta$ terms can be cancelled by the choice of $z_{02}$. The $\cos 3\theta$ and $\sin 3\theta$ terms can be eliminated by the freedom in choosing $\mathcal{A}$ (there are 2 free parameters there). One should find an $\mathcal{A}$ such that $g_{32} = g_{30}$ and $g_{31} = g_{33}$. The result is that all the third order terms will be eliminated, and the fourth order terms are eliminated as well. Note that if the problem did not have the symmetry it does, there would be $g_4$ terms which would lead to nonzero $\cos 4\theta$ and $\sin 4\theta$ terms, and the fourth order terms could not be removed. Since the parabola is not perfectly symmetric, one might argue that it is important to include those fourth order terms from the broken symmetry. However, the symmetry breaking is small in the FFAG designs under consideration, and for the large phase space we wish to transmit, the magnitude of the coefficients of the fourth order terms may be less than the magnitude of the coefficients of the fifth order terms by more than a factor of the magnitude of the phase space variables. If this is true, the symmetric approximation is still important. This paper will discuss only the symmetric case, and the asymmetric case will be addressed in a subsequent paper.
To determine the magnitude of the ellipse distortion, we will need to determine the change in the square of the radius to fifth order in the phase space variables. This requires computing $\mathcal{M}_C z_1$ to fourth order; the fourth-order part is
$$\tfrac{1}{6}[g_3, [g_3, [g_3, z_1]]] + [g_5, z_1] = \begin{pmatrix} -(5g_{30}^2 + g_{33}^2)g_{33}x^4 + 8(g_{30}^2 - g_{33}^2)g_{30}x^3p + 6(3g_{30}^2 - g_{33}^2)g_{33}x^2p^2 + 16g_{30}g_{33}^2xp^3 + (3g_{33}^2 - g_{30}^2)g_{33}p^4 \\ (g_{33}^2 - 3g_{30}^2)g_{30}x^4 - 16g_{30}^2g_{33}x^3p + 6(g_{30}^2 - 3g_{33}^2)g_{30}x^2p^2 + 8(g_{30}^2 - g_{33}^2)g_{33}xp^3 + (g_{30}^2 + 5g_{33}^2)g_{30}p^4 \end{pmatrix} + \begin{pmatrix} -g_{51}x^4 - 2g_{52}x^3p - 3g_{53}x^2p^2 - 4g_{54}xp^3 - 5g_{55}p^4 \\ 5g_{50}x^4 + 4g_{51}x^3p + 3g_{52}x^2p^2 + 2g_{53}xp^3 + g_{54}p^4 \end{pmatrix}, \qquad (37)$$
where I made use of the fact that $g_{30} = g_{32}$ and $g_{31} = g_{33}$, and $g_{5k}$ is the coefficient of $x^{5-k}p^k$ in $g_5$. Including a fourth order term $z_{04}J^2$ in $z_0(J)$, the fifth order terms in Eq. (31) will be
$$\Delta r^2 = 2(2J)^{5/2}\left[r_{50}\cos^5\theta + r_{51}\cos^4\theta\sin\theta + r_{52}\cos^3\theta\sin^2\theta + r_{53}\cos^2\theta\sin^3\theta + r_{54}\cos\theta\sin^4\theta + r_{55}\sin^5\theta\right],$$
where, for example, $r_{50} = g_{33}(3g_{30}^2 - g_{33}^2) - g_{51}$. Rewrite this as a sum of terms linear in $\cos n\theta$ and $\sin n\theta$, and choose $z_{04}$ so as to eliminate the $\cos\theta$ and $\sin\theta$ terms; this choice involves the combinations $-g_{51} - g_{53} - 5g_{55}$ and $g_{54} + g_{52} + 5g_{50}$ of the fifth-order coefficients. Now, finding the maximum value of $\Delta r^2$ is straightforward. Take the derivative of the fifth order terms in $\Delta r^2$ with respect to $\theta$, divide by $\cos^5\theta$ or $\sin^5\theta$, and find the zeros of the polynomial in $\tan\theta$ or $\cot\theta$. Substitute the resulting values into the fifth order terms in $\Delta r^2$ to get the maximum value of $\Delta r^2$. The resulting minimum distortion is shown in Fig. 8, and the $b$ corresponding to that minimum distortion is shown in Fig. 9. The distortion of the radius is proportional to the fourth power of the radius (it was proportional to the square of the radius if the $J$-dependent shift and ellipse shape change were not included) and, for small $a$, is approximately proportional to $(a - 1/24)^{-3}$.
Depending on how the ellipse is viewed physically, the nonzero $z_0(J)$ may imply that one is not accelerating all the way from $p = 0$ to $p = 1$. In that case, one can adjust $a$ after the calculation to correct for this.

IV. ACCURACY OF THE CALCULATION
Since this calculation only keeps the lowest-order terms relevant to the desired result, there must be an inaccuracy in the estimate of the ellipse distortion and the emittance growth.
For the ellipse boundary distortion without the amplitude-dependent linear transformation, Fig. 10 shows the relative inaccuracy in the calculation. The calculation is done by choosing a value for $a$, choosing $b$ and an ellipse orientation to give the minimum value for $\Delta J$, tracking 1001 points on the ellipse (spaced equally on a circle then transformed to the ellipse), calculating the actual value of $\Delta J$, and taking the maximum value. $\Delta J$ is calculated from the tracking data based on the expected ellipse (i.e., I did not try to find an ellipse that best fit the tracking data). The calculation was done for $a$ from 0.05 to 0.15, and the result was nearly independent of $a$, so only a single curve is shown in Fig. 10. Figure 11 shows the result for the same calculation but for the calculation of the ellipse boundary distortion with the amplitude-dependent linear transformation. In this case, the results do depend on $a$, but the inaccuracies are qualitatively similar for different $a$. The sharp rise near zero is expected from the fact that $\Delta J/J \propto (2J)^{3/2}$, and the relative correction should be proportional to $(2J)^{1/2}$ (this also explains the linear behavior in Fig. 10). Despite the initial rapid rise, the inaccuracy levels off quickly and stays around 10% or lower (lower for larger $a$) until the desired distortion gets to around 5%-6%; at that point, the inaccuracy begins to increase very rapidly (less rapidly for larger $a$). As one is willing to tolerate a larger ellipse distortion, the higher-order terms begin to become more important, until a point where they cause a rapid reduction in the accuracy of the estimate made here.
For emittance growth, the accuracy of the calculation is shown in Fig. 12. The emittance growth is computed by integrating $10^5$ particles distributed uniformly in $\theta$ and in $J$ according to the distribution of Ref. [19]. This distribution function leads to a one-dimensional distribution (in either $x$ or $p$) which has no particles beyond $3\sigma$. The inaccuracy in the calculation quickly becomes extremely large. This is due to the large-amplitude tails in the distribution, which are more strongly affected by the higher-order terms and contribute disproportionately to the emittance growth, as can be seen in Fig. 13. To reduce this effect, one can apply the following iterative procedure to the final distribution: find the second-order covariance matrix; remove all particles outside of a $k\sigma$ ellipse, where $k$ is a given constant; repeat these two steps until no particles are cut. The results of this cutting procedure are shown in Fig. 14. The value of $k$ has an extremely strong effect on the result, and there does not seem to be a clear way to choose the best value for $k$. The poor accuracy of the emittance growth calculation without the cutting procedure, the lack of a good choice for the cutting amplitude, and the fact that for some distributions the emittance growth calculation leads to a reduction in the calculated emittance together lead one to conclude that emittance growth does not seem to be a good criterion for the optimization of a machine design, except possibly when only very small emittance growths are desired (in the range of 1%-2%) and for Gaussian or nearly Gaussian distributions.
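The iterative cutting procedure described above can be sketched as follows. This is one possible reading, in which the $k\sigma$ ellipse is taken to be the set of points with $d^T\Sigma^{-1}d \le k^2$, where $d$ is the offset from the mean and $\Sigma$ is the covariance matrix; that interpretation and the function names are assumptions.

```python
def second_moments(points):
    """Mean and covariance elements (mx, mp, sxx, sxp, spp) of (x, p) samples."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    mp = sum(p for _, p in points) / n
    sxx = sum((x - mx)**2 for x, _ in points) / n
    spp = sum((p - mp)**2 for _, p in points) / n
    sxp = sum((x - mx) * (p - mp) for x, p in points) / n
    return mx, mp, sxx, sxp, spp

def cut_tails(points, k):
    """Iteratively drop particles outside the k-sigma ellipse until none are cut."""
    kept = list(points)
    while len(kept) >= 3:
        mx, mp, sxx, sxp, spp = second_moments(kept)
        det = sxx * spp - sxp * sxp
        if det <= 0.0:                      # degenerate distribution: stop cutting
            break
        inside = []
        for x, p in kept:
            dx, dp = x - mx, p - mp
            # squared distance in units of sigma: d^T Sigma^{-1} d
            m2 = (spp * dx * dx - 2.0 * sxp * dx * dp + sxx * dp * dp) / det
            if m2 <= k * k:
                inside.append((x, p))
        if len(inside) == len(kept):        # nothing was cut: converged
            break
        kept = inside
    return kept
```

With this definition the average of $d^T\Sigma^{-1}d$ over any sample is exactly 2, so the iteration can only converge with particles remaining when $k > \sqrt{2}$. This is one way to see why the result is so sensitive to $k$.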

V. OTHER CRITERIA FOR OPTIMIZATION
In real machines, there may be criteria other than just the longitudinal distortion that determine the optimal parameters to use.
For example, for small values of $a$, the minimum longitudinal distortion occurs when $b$ takes on its minimum value, $1/3 - 2a$. One can compute the change in $u$ (and thus the total arc length covered) in accelerating from $p = 0$ to $p = 1$ as a function of $a$ and $b$; it is the integral
$$\int_0^1 \frac{6\,dp}{\sqrt{36a^2 - (2p - 1)^2\left[(2p - 1)^2 - 3b\right]^2}}.$$
This integral can be done in terms of an elliptic integral: the result involves $F$, the incomplete Jacobi elliptic integral of the first kind, and a quantity $q_0$ whose defining equation contains a cube root. Note that in the equation for $q_0$, one should take the real solution of the cube root. A more restricted version of this result appears in [20]. The change in $u$ is shown in Fig. 15 as a function of $b$ for various $a$. It is clear from that figure that the smallest change in $u$, and therefore the smallest time in the FFAG, does not occur when $b$ is at its minimum value. For muon machines, where muon decay must be minimized, or for machines where the machine's repetition rate must be maximized, one may not want to use the minimum $b$, even though that does give the smallest longitudinal distortion for a given $a$. The decay loss at the minimum $b$ can be evaluated using the machine parameters referred to in Sec. II.
Instead, one may choose a different figure of merit, which combines the longitudinal distortion with some other quantity. For instance, one could combine the decay loss with the longitudinal distortion by imagining that any area outside of the ellipse is loss. If there is only one harmonic in $\theta$ in the ellipse distortion, and higher-order terms in $\Delta J/2J$ are ignored, then the fractional loss can be written in terms of the decay loss and $\Delta J/2J$. For this merit factor, for a 10-20 GeV muon FFAG, 201.25 MHz rf with an average gradient of 1.5 MV/m, and a normalized longitudinal acceptance (transmitted ellipse area divided by $\pi$) of 150 mm, Fig. 16 shows the $b$ which minimizes the above fractional loss, and Fig. 17 shows the minimum loss as a function of $a$. I use the ellipse distortion with the amplitude-dependent shift of the center to compute this.
For sufficiently small $a$, the minimum $b$ is still optimal, since the distortion "losses" far exceed the decay losses. As $a$ gets larger, however, the decay losses become more important, and the optimal $b$ is no longer the minimum, as can be seen from Fig. 15.
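The dependence of the change in $u$ on $b$ can be explored with a simple quadrature in place of the elliptic-integral form. The following is a minimal sketch, assuming the integrand $6/\sqrt{36a^2 - (2p-1)^2[(2p-1)^2 - 3b]^2}$ on the central ($H = 0$) trajectory. The midpoint rule is used so that the end points, where the integrand diverges at the minimum $b$, are never evaluated. The function name and the number of intervals are illustrative.

```python
import math

def delta_u(a, b, n=20000):
    """Midpoint-rule estimate of the change in u from p = 0 to p = 1 on H = 0."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        q = 2.0 * (k + 0.5) * h - 1.0           # q = 2p - 1 at the interval midpoint
        radicand = 36.0 * a * a - (q * (q * q - 3.0 * b))**2
        if radicand <= 0.0:
            raise ValueError("no H = 0 trajectory from p = 0 to p = 1 for these a, b")
        total += 6.0 * h / math.sqrt(radicand)
    return total
```

Because $\cos x \le 1$, the result is never smaller than $1/a$. For $a = 1/12$, comparing a value of $b$ just above the minimum of $1/6$ with a value near the middle of the allowed range shows the behavior described above: the smallest change in $u$ does not occur at the minimum $b$.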

VI. CONCLUSIONS
I have described a method for quantifying the longitudinal distortion of a phase space ellipse in a dynamical system. A Dragt-Finn expansion is used to facilitate the lowest-order minimization of the distortion. The method is applied to linear nonscaling FFAGs, where longitudinal phase space distortion is an important design consideration. It yields a procedure for choosing important FFAG design parameters. Using emittance growth to characterize the distortion in this system appears to be problematic, whereas distortion of an ellipse seems to work well.
Finally, this method has broader applicability to single-pass systems, by replacing the similarity transformation on the map by $\mathcal{A}$ with separate initial and final transformations.