Beam Dynamics in High Intensity Cyclotrons Including Neighboring Bunch Effects: Model, Implementation and Application

Space charge effects, being one of the most significant collective effects, play an important role in high intensity cyclotrons. However, for cyclotrons with small turn separation, other existing effects are of equal importance. Interactions of radially neighboring bunches are also present, but their combined effects has not yet been investigated in any great detail. In this paper, a new particle in cell based self-consistent numerical simulation model is presented for the first time. The model covers neighboring bunch effects and is implemented in the three-dimensional object-oriented parallel code OPAL-cycl, a flavor of the OPAL framework. We discuss this model together with its implementation and validation. Simulation results are presented from the PSI 590 MeV Ring Cyclotron in the context of the ongoing high intensity upgrade program, which aims to provide a beam power of 1.8 MW (CW) at the target destination.

1 million), even on current state-of-the-art supercomputers. In contrary, in P-M methods the fields are calculated on the discrete domains. Due to its high efficiency and high precision, P-M based Particle-In-Cell (PIC) methods [13] are widely used in parallel macro-particles simulation codes for different types of accelerators and beam lines [6,[14][15][16][17] as well as many other areas of computational science, thanks to the development of parallel computation technology during recent years. The Parallel PIC model is the method of choice in this study on the beam dynamics of high intensity cyclotrons.
For high intensity cyclotrons, single bunch space charge effects are not the only contribution. Along with the steady increase of beam current, the mutual interaction of neighboring bunches in radial direction becomes more and more important, especially at large radii where the distances between neighboring bunches diminishes, and even the overlap can occur. One example is the PSI 590 MeV Ring Cyclotron [18] with a production current of 2 mA in CW operation and a beam power on target of approximately 1.2 MW. An ambitious upgrade program for the PSI Ring Cyclotron is in progress, aimed for 1.8 MW CW beam power on target. The concept involves replacing four aluminum cavities by new copper cavities with peak voltages increasing from about 0.7 MV to above 0.9 MV, meanwhile the old flat-top cavity remains in use with peak voltage standing at 11.2% of the main voltage. After the planned upgrade is finished, the total turn number can be significantly reduced, e.g. from more than 220 turns to less than 170 turns.
The turn separation is consequently increased as shown in Fig. 1, but remains at the same order of magnitude as the measured radial bunch size (at the 1σ level) and is also dependent on the increasing bunch current. Therefore, when the beam current increases from 2 mA to 3 mA, the correct treatment of space charge effects is of great importance.
This includes the mutual space charge effects between radially neighboring bunches.
Another example is the 100 MeV compact cyclotron (CYCIAE-100) under construction at CIAE [19]. Although its beam current is only 200 µA to 500 µA, because of the small energy gain per turn (about 200 keV), the turn separation is far more smaller than the beam size at outer radii (at the extraction, ∆R = 1.5 mm) and multiple bunches will overlap.
Because of the complexity of the problem, it is impossible to evaluate neighboring bunch effects precisely and self-consistently by explicit analytic expressions. However, high performance computation (HPC) makes it possible to treat this problem in greater detail. To our knowledge, little research has been undertaken in neighboring bunch effects, and the only published work in that regard was by E. Pozdeyev [7]. He introduced "rigid auxiliary bunches" in his serial code CYCO which uses the azimuthal angle as the independent variable.
In Section II, a new PIC based self-consistent numerical simulation model is presented which, for the first time, covers neighboring bunch effects. Section III describes our three-dimensional object-oriented parallel simulation code OPAL-cycl, a flavor of the OPAL framework. The results of performance testing and code benchmarking are presented in Section IV, and followed by its applications on the PSI 590 MeV Ring Cyclotron in Section V. The last section is devoted to teh conclusion of the paper.

A. PIC MODEL IN CYCLOTRON
In the cyclotrons and beam lines under consideration, the collision between particles can be neglected because the typical bunch densities are low. In time domain, the general equations of motion of charged particle in electromagnetic where m 0 , q, γ are rest mass, charge and the relativistic factor. With p = m 0 cγβ we denote the momentum of a particle, c is the speed of light, and β = (β x , β y , β z ) is the normalized velocity vector. In general the time (t) and position (x) dependent electric and magnetic vector fields are written in abbreviated form as B and E.
If p is normalized by m 0 c, Eq. (1) can be written in Cartesian coordinates as The evolution of the beam's distribution function f (x, cβ, t) can be expressed by a collisionless Vlasov equation: where E and B include both external applied fields, space charge fields and other collective effects such as wake fields In order to model a cyclotron, the external electromagnetic fields are given by measurement or by numerical calculations.
The space charge fields can be obtained by a quasi-static approximation. In this approach, the relative motion of the particles is non-relativistic in the beam rest frame, so the self-induced magnetic field is practically absent and the electric field can be computed by solving Poisson's equation where φ and ρ are the electrostatic potential and the spatial charge density in the beam rest frame. The electric field can then be calculated by and back transformed to yield both the electric and the magnetic fields, in the lab frame, required in Eq. (4)  The combination of Eq. (3) and Eq. (5) constitutes the Vlasov-Poisson system. In the content followed, the method of how to solve these equations in cyclotrons using PIC methods is described in detail.
Considering that particles propagates spirally outwards in cyclotrons, and the longitudinal orientation changes continuously, three right-handed Cartesian coordinate systems are defined, as shown in Fig. 2. The first coordinate system is the fixed laboratory frame S lab , in which the external field data is stored and the particles are tracked. Its origin is the center of the cyclotron and its X − Y plane is the median plane and the positive direction of Z axis points to vertical direction. The second coordinate system is the local instantaneous frame S local , which is a temporal auxiliary frame for the space charge solver. Its origin O is the mass center of the beam and the orientation of the Y axis is coincident with the average longitudinal direction and the positive orientation of the Z axis points to the vertical direction.
The third coordinate system is the beam rest frame S beam , which is co-moving with the centroid of the beam. It has the same orientation and origin as S local , but the length in longitudinal direction is scaled by 1/γ due to relativistic effects.
At each time step, in order to seek a solution for the space charge fields, the frames S local and S beam are redefined according to current 6D phase space distribution, and all particles are transformed from S lab to S local , then a Lorentz S beam . In a 3D Cartesian frame, the solution of the Poisson equation at point (x, y, z) can be expressed by with G the 3D Green function G(x, x , y, y , z, z ) = 1 assuming open boundary conditions. The typical steps of calculating space charge fields using the Hockney's FFT algorithm [13] is sketched in Algorithm 1, where the quantities with superscript D (discrete) refer to grid quantities.
With respect to the external magnetic field two possible situations can be considered: in the first situation, the real field map is available on the median plane of the existing cyclotron machine using measurement equipment. In most cases concerning cyclotrons, the vertical field, B z , is measured on the median plane (z = 0) only. Since the magnetic field outside the median plane is required to compute trajectories with z = 0, the field needs to be expanded in the Z direction. According to the approach given by Gordon and Taivassalo [20], by using a magnetic potential and measured B z on the median plane at the point (r, θ, z) in cylindrical polar coordinates, the 3rd order field can be written as Algorithm 1 3D Space Charge Calculation where B z ≡ B z (r, θ, 0) and All the partial differential coefficients are computed on the median plane data by interpolation, using Lagrange's 5-point formula.
In the other situation, 3D field for the region of interest is calculated numerically by building a 3D model using commercial software during the design phase of a new cyclotron. In this case the calculated field will be more accurate, especially at large distances from the median plane i.e. a full 3D field map can be calculated. For all calculations in this paper, we use the method by Gordon and Taivassalo [20].
Finally both the external fields and space charge fields are used to track particles for one time step using a 4th order Runge-Kutta (RK) integrator, in which the fields are evaluated for four times in each time step. Space charge fields are assumed to be constant during one time step, because their variation is typically much slower than that of external fields.
The code is intended to model steady state conditions for the multi-bunch beam dynamics. In cyclotrons the pattern of turn separation ∆R is affected by many factors. These include machine characteristics such as the magnetic field, the acceleration voltage profile, the accelerating phase of the RF resonators and initial centering conditions of the injected bunches. Generally, in cyclotrons, ∆R reduces gradually with increasing beam energy. For machines like the PSI Injector II, ∆R stays sufficiently large from injection to extraction, and in such cases, neighboring bunch effects are negligible. For others, like the PSI 590 MeV Ring Cyclotron under consideration in this section, ∆R decreases strongly during the course of acceleration, which results in the need of considering the neighboring bunch effects in order to obtain a precise description of the beam dynamics. In our model, we apply an iterative procedure to determine the number of bunches necessary for a converged simulation. Initially a single bunch with phase space density f 0 and the average radial position R s is injected. This single bunch is tracked with space charge for one revolution period T r . Then the new average radial position of the bunch R e and the bunch rms size r rms = x 2 rms + y 2 rms are calculated from the actual particle distribution. The turn separation ∆R at injection position is then given by ∆R = R e − R s .
If the condition: is fulfilled (where M is a parameter given by the user), the 6D phase space is stored as f Re . The code is switched to multi-bunch mode, and f Re will be used as the initial phase space for the following (N B − 1) neighbouring bunches which will be injected one by one per T r time, where N B is the number of neighboring bunches given by the user.
If the condition of Eq. (11) is not fulfilled, the value of R e is assigned to the variable R s . This single bunch is tracked with space charge for another revolution period T r , and the new average radial position of the bunch R e , the bunch rms size r rms and the turn separation ∆R are calculated again. After that, the condition in Eq.(11) is evaluated and the same procedure is repeated accordingly.
The underlying assumption for this Ansatz is that all bunches have the same phase space distribution when they reach a certain position, i.e. when f Re is saved. This is realistic and reasonable when the machine is running in a steady state. It need be mentioned that up to that position the coherent instabilities which might be caused by neighboring bunch effects, are not covered by this model.
This procedure is summarized in Algorithm 2.
In the multi-bunch algorithm above, two parameters M and N B are introduced to set the time of injecting new  bunches and the total bunch number respectively. The proper settings of these two parameters are crucial for the precise evaluation of neighboring bunches effects.
In order to quantify the range of the two parameter N B and M , let's consider a 2D non-relativistic DC beam. The Bassetti-Erskine [21] formula for the electric field of a 2D Gaussian charge distribution is in general an analytical expression in terms of the complex error function. In case of an axisymmetric and Gaussian charge distribution the electric field can be expressed by n r , r ≤ a n r , r > a. In theory, when the maximum bunch number N B equals to the total turn number of the machine, one can eventually obtain the fully self-consistent solution of the problem within our model. In reality, it is not feasible to simulate a full set of bunches, which typically range from several ten to several hundred. The scale of the number of particles and the dimensions of the needed grid are still beyond the capability of today's supercomputer resources.
In a multi-bunch simulation the energy of the bunches at different turns can be substantially different. For a multi-bunch simulation with 9 neighbouring bunches, if the kinetic energy of the first bunch is E k = 100 MeV and energy gain per turn is ∆E k = 2 MeV, hence the velocity difference of the first and last bunch is 6.5%. In this case there is no single rest frame in which the relative motions of particles are non-relativistic, as required by our scheme to calculate the space charge forces. Consequently it is not sufficient to use only one rest frame and one single Lorentz transformation. In order to calculate the space charge fields more precisely, we use an adaptive binning technique outlined in Ref. [22] (section IV). We note that in our application neither radiation nor retardation effects play a significant role and can therefore be neglected. In the rest frame of the beam, transverse currents effects can also be neglected and hence no longitudinal magnetic field component must be considered.
First we create the same amount of energy bins (N B ) as we have bunches in the simulation. An average relativistic factorγ i for the i th bunch with N i p simulation particles is computed, Then every particle is grouped into the energy bin whoseγ i is closest to its γ. In this way, the energy spread of each bin is small and the relative motions of the particles in the same bin are small. After binning we perform the Lorentz transformation, calculate the space charge field and perform back-transformation for each bin respectively. Finally the field data is summed up to give the total space charge force imposed on each particle.
The energy spread of the bunches can be large (MeV range), especially in cyclotrons without flat-top cavities, and at large radii. This may result in an overlap of energy distributions of neighbouring bunches, and hence the energy bins have to be recalculated i.e. all particles need to be regrouped after each revolution. It is worth noting that, in cyclotrons, the energy difference of neighboring bunches changes with the increasing radius. Therefore the energy difference of the neighboring bins is not constant. Specifically, the energy difference between the i th and (i − 1) th bins, ∆Ē i,i−1 , differs with the energy difference between the (i + 1) th and the i th bins ∆Ē i+1,i .

III. IMPLEMENTATION WITHIN THE OPAL FRAMEWORK
The above model and algorithm are implemented in the object-oriented parallel PIC code OPAL-cycl. OPALcycl is one of the flavors of the OPAL (Object Oriented Parallel Accelerator Library) framework [23]. This framework is a powerful tool for charged-particle optics in general accelerator structures and beam lines using the MAD languages with extensions. OPAL is based on the CLASSIC [24] library and the IP 2 L framework [25]. The CLASSIC library is a C++ class library which provides services for building portable accelerator models and algorithms and inputing language to specify complicated accelerator systems in general. IP 2 L is an Object-Oriented C++ class library which provides abstractions for particles and structured field calculation in a data-parallel programming style. It provides an integrated, layered system of objects. The upper layers contain global data objects of physical/mathematical quantities, such as particles, fields and matrices of meshes and typical methods performed on these objects such as differential operators and multi dimensional FFT's. The lower layers contain the objects relevant to parallelization such as data distribution, domain decomposition, communication among processors, load balancing and expression templates. Statistical data, such as root mean square (rms) quantities, are computed on the fly (in situ) and stored in conjunction with phase space and field data in the H5Part [26] file format. In a post processing step the data can be analyzed using the visualization tool H5PartROOT [27].
In addition, apart from the multi-particle simulation mode, OPAL-cycl also has two other serial tracking modes for conventional cyclotron machine design. One mode is the single particle tracking mode, which is a useful tool for the preliminary design of a new cyclotron. It allows to compute basic parameters, such as reference orbit, phase shift history, stable region and matching phase ellipse. The other one is the tune calculation mode, which can be used to compute the betatron oscillation frequency ν r and ν z . This is useful for evaluating the focusing characteristics of a given magnetic field map.
A more detailed description of the hierarchical layout, the parallelization and the implementation issues of the OPAL framework and OPAL-cycl code can be found in the User's Reference Guide [23].

IV. PERFORMANCE TEST AND VALIDATION
In order to evaluate the performance and to benchmark the functionalities of the newly developed code, we performed different types of simulations on the 72 MeV Injector II cyclotron of PSI, which has been intensively studied before.
Some results are presented in this section.

A. Single particle tracking and tune calculation
In theory, there is an eigen-ellipse for any given energy in a cyclotron under stable conditions. When the initial phase space matches this eigen-ellipse, the oscillation of the beam envelope amplitude will be minimal and the transmission efficiency will be maximal. In the present test, the eigen-ellipse at 2 MeV kinetic energy is calculated using the single particle tracking mode of OPAL-cycl. The result is compared to FIXPO [28,29]. At PSI the FIXPO code has been the standard simulation tool for the design and parameter optimisation of the Injector II and the Ring Cyclotron as well as for a gas driven muon trap in a cyclotron shaped magnetic field. The code integrates single particle orbits by use of a predictor corrector algorithm up to the third order. In Fig. 5 the matched radial ellipse with an initial offset of x = 2.0 mm, p x = 0.0 mrad at the symmetry line of the sector field is shown. Excellent agreement is obtained when the time step is set to 1 ps in OPAL-cycl, although FIXPO uses a different tracking algorithm with the azimuthal angle as the independent variable.
The tune diagram of Injector II is computed using the tune calculation mode of OPAL-cycl, as shown in Fig. 6.
The result from FIXPO and OPAL-cycl are again in good agreement even though different numerical algorithms are used.
The field interpolation scheme, particle tracking and tuning calculation functionalities are validated substantially by the above tests. We can see that good scalability is achieved up to 128 processors. Above 128 processors, the time consumption of the phase space dumping starts to become significant. The reason for the behavior is the increasing overhead in communication with respect to the amount of data to be stored Nevertheless, the scalability of the space charge solver and the particles integrator still benefit from a large number of processors.

C. Stationary Round Distribution in the PSI Injector II
Space charge effects usually result in an increase in beam size and emittance. That is detrimental to beam dynamics.
However there are cases, where space charge effects can actually play a positive role. The PSI Injector II cyclotron is a space charge dominated machine, in which a very compact stable beam is developed within the first several turns, and thereafter, the charge distribution does not change significantly. This stationary situation remains essentially unchanged until extraction and the beam phase width is about 2 • in the last turn. This is due to the combined effect of the strong coupling between the radial and longitudinal directions in the cyclotron and the space charge when the beam current increases above 1 mA. S. Koscielniak and S. Adam reproduced this phenomenon by using the serial two-dimensional code PICN [4]. PICN is based on the Needle model, which treats the beam as an ensemble of charged vertical needles with the same height as the beam. In this model, the vertical motion of particles is separated from the horizontal component and the internal motion within needles is neglected. In order to validate the space charge solver module of our code, we performed the 1 mA, 3 MeV coasting beam simulation on this machine and compared the result with PICN.
We used the same initial distribution as in Ref. 30. 2σ longitudinal = 13.4 mm ( 15 • phase width), 2σ transverse = 2.52 mm. The initial emittances of the radial and azimuthal directions are set to zero. In the vertical direction, 2σ z = 4 mm and 2σ pz = 3.68 mrad. The total macro-particles number is 1 million. Figure 8 shows the top view of beam shape in the local beam frame. We can see that a stable core is developed after about 10 turns, which is faster than the formation of stable haloes.
When comparing Fig. 8 with Fig. 3 and Fig. 4 in Ref. 30 calculated by PICN, we can find the results agree with each other qualitatively. Both of these two codes predict the formation of a compact stable core and wide haloes after 40 turns. However, there still exist visible differences. PICN shows an almost "round" charge density distribution, in the case of OPAL-cycl we still see low density halo. The longitudinal size of the core is about 2 mm longer than the transverse size. This difference mainly comes from different physical models used in the two codes. PICN uses a so called smooth approximation which treats the particle orbits as pure circles, but with a sinusoidal radial/vertical focusing(betatran oscillation) having a realistic value of about 1.14 at 3 MeV. Of course, no realistic magnetic field can be azimuthally constant and in the same time focusing in both directions [31]. In addition all particles in close proximity to the horizontal plane are represented by a single "needle"; in OPAL-cycl only those particles close to each other in configuration space are represented by a single macro-particle. The latter is believed to be more realistic and accurate.
It can be concluded from this comparison that OPAL-cycl can reliably reproduce the stable "round" beam formation caused by space-charge effects in Injector II and hence can accurately reproduce the single bunch dynamics

V. APPLICATIONS
We start this section by describing the two cyclotrons under consideration. The key parameters of the machines are given in Table I. In addition we mention the fundamental RF frequency is 50.633 MHz. In the Injector II due to the circular bunch formation, discussed in section IV, the original designed third harmonic flat-top resonator is now being used as an additional accelerating structure. It is now obvious to ask the question if one can find a feasible working point for the Ring Cyclotron with the same characteristics as obtained in the Injector II. However because of the overlapping tuns in the Ring Cyclotron, the situation is much more complex than in the Injector II. Those two issues are addressed for the first time in the remainder of this paper.   the beam does not expand notably in the radial direction, which means no substantial increase of the beam loss on the extraction septum is expected for bunches with initial phase width less than 10 • .

B. Neighboring bunch effects in the PSI Ring
As discussed in Section I, neighboring bunch effects may have an appreciable influence on beam dynamics in the Ring. This can be evaluated by comparing the difference in single bunch and multiple bunch simulations as described in Section III. We have run simulations for 3, 5, 7 and 9 bunches and found that the difference between the 7 bunch scenario and that of 9 bunch scenarios is small, i.e. is viewed as converged, as illustrated in Fig. 12 and discussed already in Section II B. From Fig. 13 we conclude that the FWHM of the transverse profile is reduced by approximately 33% comparing a 1 and 9 bunch simulation. For the energy spread (FWHM) we have a reduction in the order of 14% i.e from 0.7 MeV to 0.6 MeV.
As Gordon explained in [10] the particle motion in a cyclotron is always perpendicular to the force resulting in a vortex motion. In order to obtain the observed sharpening of the distribution we need an additional, azimuthal force.
A possible explanation of the origin of this force is due to the observed broken circular symmetry when considering neighboring bunches in the simulation. However more efforts are needed in order to understand this effect in greater detail.
From the comparison we conclude that the integration of neighboring bunch effects into the model has non-negligible impacts on the beam dynamics for beam currents beyond 1 mA in the PSI Ring. The bunch becomes more compact in the transverse direction and the energy spread is slightly reduced. Therefore neighboring bunch effects have a positive influence on reducing beam loss in high intensity operation.

VI. CONCLUSIONS AND DISCUSSIONS
A physical model for the beam dynamics in high intensity cyclotrons, which includes for the first time the space charge effects of neighboring bunches, is presented in this paper. This model is implemented in an object-oriented three-dimensional parallel PIC code (OPAL-cycl), as a flavor of the OPAL framework.
The performance tests on the CRAY XT3, CSCS demonstrate a good scalability of OPAL-cycl with respect to the number of used processors. The three operation modes of this code (tune calculation, single and multiple particle mode) are validated by code comparison. OPAL-cycl has been successfully applied to study the behavior of the PSI Ring Cyclotron at high intensities.
The beam intensity of this high power facility is practically limited by uncontrolled losses at the extraction element of the cyclotron, originating from beam tails that are developed during the acceleration process. As shown in the results, the generation of beam tails can be avoided if short bunches with a phase length of 2 • or less are injected. An upgrade plan is under way to generate such short bunches with the help of a 10th harmonic buncher. Furthermore it is observed that the neighboring bunch effects can help to narrow the transverse beam size and reduce the energy spread. The differences between single and multi-bunch simulations are in the order of 33% and 15% in the compared quantities, beam size and energy spread respectively. This is a significant difference between single and multi-bunch simulation, and hence justifies the presented simulation method and their application. This is an important step towards the quantitative understanding of beam tails in high power cyclotrons. Considering the fact that in the PSI Ring Cyclotron the total sum of controlled and uncontrolled losses are in the order of 10 −4 , a precise beam dynamics simulation must cover radial neighboring tuns in order to predict losses with the mentioned intensities.
It is planned to refine these simulations within the next few years by a more detailed determination of the initial particle distribution at the injection of the PSI Ring Cyclotron. A quantitative comparison of the results with measured beam quantities will be presented in a future paper.

VII. ACKNOWLEDGMENTS
The authors would like to extend sincere thanks the Accelerator Modeling and Advanced Computing group members C. Kraus, Y. Ineichen and B. Oswald for many useful discussions regarding programming and T. Schietinger for providing the post-processing tool H5PartRoot. We also feel indebted to W. Joho, S. Adam and R. Dölling for many helpful discussions regarding high intensity beam dynamics in cyclotrons. This work was performed on the Merlin3 cluster at the Paul Scherrer Institut and on the Cray XT3 at Swiss National Supercomputing Center (CSCS) within the "Horizon" collaboration. One author (J. J. Yang) was partially supported by Natural Science Foundation of China (10775185) during his sabbatical at PSI. The authors would also like to thank the referees for a number of suggestions that have substantially improved the form of this work.