Toolkit for simulated commissioning of storage-ring light sources and application to the advanced light source upgrade accumulator

We present a new accelerator toolbox (AT)-based toolkit for simulating the commissioning of light-source storage rings. The toolkit provides a framework for supporting high-level scripts that represent with realism the various procedures (e.g., orbit and optics correction, beam-based alignment) encountered during commissioning, and it is designed to mirror as closely as possible the reality as seen from the control room. Emphasis is placed on the inclusion of a comprehensive set of error sources and faithful modeling of beam diagnostics. The toolkit capabilities are demonstrated in an application to the recent design and commissioning studies of the Advanced Light Source Upgrade (ALS-U) Accumulator Ring, whose rapid and successful commissioning will be critical to the overall success of the ALS-U project.


I. INTRODUCTION
To achieve small beam emittance, diffraction-limited light sources employ lattice designs based on high-gradient and small-aperture focusing elements, which lead to larger natural chromaticities, stronger chromatic sextupoles, and ultimately highly nonlinear lattices [1,2]. A consequence of the combined strong nonlinearities and focusing is an enhanced sensitivity to magnet and other lattice errors.
This places emphasis on the need for realistic modeling of the relevant errors, the development of efficient beam orbit/optics correction schemes, and high-fidelity simulations of the actual procedures used for correction, with the goal of establishing feasible error tolerance specifications and ensuring rapid commissioning. As many of the new-generation light-source projects are upgrades of existing facilities, meeting the latter goal is essential to minimize the dark time [3].
The new machines challenge the traditional view that tends to represent commissioning as somewhat disjoint from the design phase and to be pursued by following a more empirical, hands-on approach. The emerging consensus is that commissioning simulations are integral to the design effort and should inform the design process from the start [4][5][6][7][8][9]. This is particularly true in the case of the Advanced Light Source Upgrade (ALS-U) [10], where the design challenges common to all new-generation light sources are magnified by the tight space constraints. This applies to varying degrees to both the storage ring (SR), the actual light source, and the accumulator ring (AR), an SR-size machine required for swap-out injection.
In this paper we report in detail on the recent development of a new numerical tool addressing the needs of 4th-generation machines and its application to the ALS-U AR design. The new tool, the Toolkit for Simulated Commissioning (SC), is an extension to the MATLAB®-based [11] Accelerator Toolbox (AT) [12]. It has been designed with the primary goal of conducting realistic commissioning simulations of electron storage rings, including a large variety of error sources and an accurate treatment of the beam diagnostics, within a framework that tries to reproduce as closely as possible the point of view of the machine operator.
SC is well suited for tasks like testing orbit/lattice correction strategies or defining effective commissioning procedures, but it is also a valuable instrument through the entire design process to vet lattice designs or assist with the specification of error tolerances and diagnostics requirements. An earlier version was briefly introduced in [13] and preliminary results have appeared in [14,15].
While the new tool can be expected to display its full potential in the application to the SR, the focus in this paper is kept on the AR in part because of its more advanced level of maturity within the ALS-U Project and in part because it represents an interesting test bed in its own right.
The AR has essentially the characteristics of a 3rd-generation machine but some aspects are reminiscent of the newer light sources. Most notable, given the relatively large emittance of the beam injected from the booster, is the requirement of small magnet apertures, intended not for field maximization (as in diffraction-limited light sources) but for magnet size and power consumption minimization to permit installation and operation in the ALS tunnel. Moreover, with the installation and early commissioning planned to be concurrent with normal ALS user operations (before the eventual replacement of the ALS with the ALS-U SR), rapid start-up and commissioning will be crucial to the overall project success.
The outline of the paper is as follows. In Sec. II we give an overview of the design of the toolkit and some of its central capabilities. The source code, including application examples, is available online [16], and detailed descriptions of all functions and their usage can be found in the manual. Section III gives an overview of the ALS-U AR as well as a detailed description of the machine layout and relevant error sources. In Sec. IV we apply the SC toolkit to the ALS-U AR in a start-to-finish commissioning simulation study that includes the transfer line from the booster.

II. TOOLKIT DESIGN AND USAGE
Realistic simulations of the operation of a complex machine like an accelerator require not only a good model of the beam dynamics but also the recognition that incomplete information about the actual machine state is available during operation, due to the many unknowns in the machine geometry, magnetic fields, and beam-diagnostic systems.
In this spirit, the SC toolkit makes a clear distinction between machine parameters that are accessible during operation on the one hand (e.g., a magnet set-point or a BPM reading) and the parameters that go into the beam dynamics simulations (e.g., the field coefficients entering the symplectic integrator through a lattice element) and their results (e.g., actual beam offsets) on the other. This provides a framework for supporting high-level scripts that simulate with realism the various procedures (e.g., rf commissioning) to be encountered during commissioning, closely mirroring the reality as seen from the control room. The logic of this approach is captured by the workflow schematically shown in Fig. 1. Typical usage of the SC toolkit proceeds through the following steps, described in more detail below: (1) initialization of the SC core structure, (2) error source definition and registration, (3) generation of a machine realization including errors, and (4) interaction with the machine.
Initialization.-In a first step, the user initializes the toolkit by calling SCinit() with the AT lattice of their machine as input. This sets up a MATLAB® structure with which nearly all subsequent functions of the toolkit interact. All relevant information about the machine and error sources is stored within this central structure. Having a single, isolated core data structure allows the state of the toolkit to be easily saved and loaded at the user's discretion.
Error source definition and registration.-In the next step, the user registers elements like magnets, BPMs or cavities in the SC structure, including all error sources they would like to consider, using the SCregister*() function family. The SCregister*() functions, e.g., SCregisterBPM() for BPMs, typically take as input the ordinates of the elements in the lattice, values for the uncertainties of any of the parameters used by the AT tracking code, and some parameters specific to SC's error model. Further, these functions are used to specify advanced properties of the elements which are subsequently accounted for by the toolkit; for instance, the user here specifies which magnets are "split" or combined-function magnets, or which magnets should be used as a dipole or skew quadrupole corrector, including their limits. The function SCplotLattice() visualizes the lattice properties including the implementation of magnets and diagnostic devices, see Fig. 4 for an example. The function SCsanityCheck() helps identify unreasonable registration of elements.
Generation of a machine realization.-Errors are randomly generated based on the uncertainties stored in SC.SIG and applied to the lattice via SCapplyErrors(). Typically, errors are modeled to follow a 2σ-truncated Gaussian distribution, where σ is the value specified by the SCregister*() functions. Multiple calls to SCapplyErrors() produce a family of lattice realizations following the same error distribution, which makes it easy to set up Monte Carlo tolerance studies. Note that a Gaussian distribution truncated at 2σ no longer has the same rms width as the original distribution; it is smaller than σ.
Interaction with the machine.-Incomplete information about the state of the machine is mediated by the function SCgetBPMreading() and the function family SCset*2SetPoints(). For example, SCgetBPMreading() models the reading of the BPMs previously defined by SCregisterBPMs(), taking into account injection errors, BPM offsets, BPM calibration errors, and more as described below. Members of the SCset*2SetPoints() family model the process of assigning the set point of an experimentally accessible variable in the control system of the accelerator, for instance the strength of a quadrupole magnet. Based on these setpoints the actual simulation parameters going into the AT tracking routine are calculated by a subsequent call to, for instance, the SCupdateMagnets() function. This mechanism provides a powerful layer of abstraction, which allows one to easily extend and modify the underlying error models during further development of the toolkit without the need to modify any existing user-side code. All commissioning routines implemented in SC exclusively use these functions to interact with the machine, so that the commissioning simulation is conducted from the point of view of operation.
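The remark on the width of a truncated Gaussian can be checked with a few lines of code. The following is a minimal Python sketch, not SC code: the function names are hypothetical, rejection sampling is an assumed way of drawing the values, and the 100 μm error is an illustrative number. For a cut at 2σ the analytic rms comes out at roughly 0.88σ.

```python
import math
import random


def truncated_gauss(sigma, cutoff=2.0, rng=random):
    """Draw one zero-mean Gaussian value, rejecting |x| > cutoff * sigma."""
    while True:
        x = rng.gauss(0.0, sigma)
        if abs(x) <= cutoff * sigma:
            return x


def truncated_sigma_ratio(cutoff=2.0):
    """Analytic rms of a unit Gaussian truncated symmetrically at +/- cutoff."""
    pdf = math.exp(-0.5 * cutoff ** 2) / math.sqrt(2.0 * math.pi)
    mass = math.erf(cutoff / math.sqrt(2.0))  # probability inside the cut
    return math.sqrt(1.0 - 2.0 * cutoff * pdf / mass)


# Illustrative 100 um rms error (hypothetical value), sampled 20000 times.
rng = random.Random(1)
samples = [truncated_gauss(100e-6, rng=rng) for _ in range(20000)]
rms = math.sqrt(sum(x * x for x in samples) / len(samples))
print(truncated_sigma_ratio(2.0), rms / 100e-6)
```

When a tolerance is quoted as an rms value, it therefore matters whether that value refers to σ before or after truncation.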
In the following subsections the details of the error model and the correction routines are outlined.

A. Error models
Magnets.-AT features symplectic integrators that allow particles to be tracked through magnetic fields of arbitrary multipole order. In our framework the multipole coefficients used in these tracking routines, denoted in AT terminology by PolynomA and PolynomB, are calculated by SCupdateMagnets() considering current setpoints $\vec{b}_{\mathrm{SP}}$, multiplicative calibration errors $\vec{\delta}_{\mathrm{cal}}$, and additive field offsets. The field offset may include a bending angle error of a pure dipole magnet or can be used to specify (higher order) multipole errors, see SCsetMultipoles().
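One plausible reading of this model is an element-wise combination of set point, calibration error and offset for each multipole order. The Python sketch below illustrates that reading only; the function name, the padding of short lists and all numerical values are assumptions, and the expression actually used by SCupdateMagnets() may differ in detail.

```python
def actual_multipoles(setpoints, cal_errors, offsets):
    """Element-wise b = (1 + delta_cal) * b_SP + delta_b for one coefficient list."""
    n = max(len(setpoints), len(cal_errors), len(offsets))

    def pad(values):
        # Missing higher orders are treated as zero.
        return list(values) + [0.0] * (n - len(values))

    return [(1.0 + d) * sp + off
            for sp, d, off in zip(pad(setpoints), pad(cal_errors), pad(offsets))]


# Hypothetical quadrupole: [dipole, quadrupole, sextupole] normal coefficients.
b_sp = [0.0, 1.2, 0.0]    # set point, known to the operator
d_cal = [0.0, 1e-3, 0.0]  # 0.1% calibration error on the main component
d_b = [0.0, 0.0, 0.05]    # small sextupole-like field offset
b_actual = actual_multipoles(b_sp, d_cal, d_b)
```

The operator only ever sees `b_sp`, while the tracking code would use `b_actual`; this is the separation between accessible and hidden parameters described above.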
As most other particle tracking codes, AT's tracking routines work with reference to a coordinate system defined by the design trajectory. The design trajectory is solely determined by the nominal length and bending angle of the lattice elements and, therefore, for a circular machine AT yields meaningful tracking results only if the design orbit is closed. This condition has to be ensured when designing the lattice file and must not be violated by efforts to model the effects of a roll or strength error of a dipole magnet. These effects have to be accounted for by means of the PolynomA/B parameters.
For example, if the magnet of interest is registered as a combined function dipole magnet, the actual bending angle is dependent on the quadrupole field gradient. Thus, if the quadrupole field differs from the design value, e.g., because of strength errors or a set point variation, the corresponding horizontal dipole field is added to the PolynomB term. Similarly, the field variations induced by a rotation of the magnet around the beam axis are calculated and applied on the PolynomA/B fields.
Cavities.-Similar to the magnet error model, multiplicative and/or additive errors can be assigned to the rf cavity voltage, frequency and phase. Note that the function to switch off/on radiation and cavities, SCcronoff(), sets the cavity pass method to RFCavityPass() to ensure proper handling of frequency deviations from the design frequency.
Injection.-The injected beam errors include a random shot-to-shot variation as well as a static offset from the 6D design injection-trajectory. Parameters of the injection pattern are centrally stored in SC and include the number of turns, particles per bunch, number of injections over which the BPM reading is averaged as well as the injected beam trajectory, the 6D beam σ-matrix and the choice of the tracking mode as described below.
• turn-by-turn mode: A bunch is tracked for the specified number of turns and the readings of the BPMs in each individual turn are returned.
• pseudo-orbit mode: The BPM readings are averaged over the turns, giving an estimate of the orbit, without having actually achieved stored beam.
• orbit mode: It is assumed that stored beam has been achieved, so that the AT function findorbit6() can be used to determine the orbit, which is then used to calculate the BPM readings.
Diagnostics.-SC calculates realistic BPM readings, taking into account a multitude of diagnostic errors. When SCgetBPMreading() is called, particle trajectories are calculated according to the tracking mode. The calculation of BPM readings from particle trajectories takes into account BPM offsets, calibration errors, rolls, and BPM noise. If more than one particle is used for tracking, the BPM reading returns a beam loss if more than a user-defined fraction of the particles are lost. The toolkit can be configured to plot the trajectory of every injected beam, which is a valuable utility to discover potential problems of the commissioning procedures. An example of the 1-turn output for the ALS-U Accumulator Ring during early commissioning is shown in Fig. 11.
Support structures and misalignments.-The transverse misalignment model was developed to reflect the magnet support structure of the ALS-U facility and includes the concepts of girders, plinths, and sections as illustrated in Fig. 2. Girders may have offset and roll errors, while entire sections and plinths (the concrete slabs on which girders are mounted) are currently considered to have offset errors only.
The girder misalignment is a stack-up of the misalignments of the corresponding section, the plinths and the misalignment of the girder itself. By default it is assumed that the magnets and BPMs are mounted on girders. Their misalignment is therefore a sum of their individual misalignment and the misalignment of the girder at the element location. This feature can be switched off so that only random misalignments of the elements are considered.
The actual misalignment distribution can be plotted using SCplotSupport(), see Fig. 6 for an example. Individual longitudinal misalignments are not considered. However, a global circumference error is modeled by scaling all drift spaces such that the sections between two dipoles are scaled by the same amount.
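The diagnostics part of this error model can be illustrated with a short sketch. The Python code below is a hedged illustration, not SCgetBPMreading() itself: the order in which offset, roll, calibration and noise are applied, the sign convention of the roll, and all function and parameter names are assumptions.

```python
import math
import random


def bpm_reading(x, y, offset=(0.0, 0.0), roll=0.0, cal_error=(0.0, 0.0),
                noise=(0.0, 0.0), rng=random):
    """Turn a true beam position (x, y) into what a simulated BPM reports."""
    # Position relative to the (misaligned) BPM centre.
    dx, dy = x - offset[0], y - offset[1]
    # Rotate into the frame of a BPM rolled around the beam axis.
    xr = math.cos(roll) * dx + math.sin(roll) * dy
    yr = -math.sin(roll) * dx + math.cos(roll) * dy
    # Apply the calibration error and additive Gaussian noise.
    return ((1.0 + cal_error[0]) * xr + rng.gauss(0.0, noise[0]),
            (1.0 + cal_error[1]) * yr + rng.gauss(0.0, noise[1]))


def pseudo_orbit(turn_readings):
    """Average the turn-by-turn readings of one BPM, as in pseudo-orbit mode."""
    n = len(turn_readings)
    return (sum(r[0] for r in turn_readings) / n,
            sum(r[1] for r in turn_readings) / n)
```

A beam passing exactly through the centre of an offset BPM reads zero in this model, which is the situation a beam-based alignment procedure tries to identify.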

B. Correction routines
Diagnostic high-level scripts include, among others, the simulated measurement of the response matrix and dispersion based on the actual injection scheme. Further, different functions have been implemented to determine various performance parameters of the lattice, such as the turn-by-turn beam transmission, dynamic and momentum aperture, and the beam lifetime. Based on these functions and observables a variety of correction scripts for the simulation of the commissioning process are implemented.
For the initial trajectory correction the toolkit implements an iterative approach using a pseudo-inversion of the machine's trajectory response matrix, see Sec. IV B for an application or Appendixes A and B for more details on the Tikhonov-regularized version of the SVD pseudo-inversion. Based on this method, the function SCfeedbackFirstTurn() can bring the machine from its uncorrected state to a state of full one-turn transmission. Subsequently, SCfeedbackStitch() achieves full two-turn transmission and SCfeedbackBalance() finally corrects the machine to a state with a period-one orbit, from which full transmission through a large number of turns can be expected. A final minimization of the BPM readings is achieved by the more generalized function SCfeedbackRun(), which also works in orbit mode and may include the dispersion, with the rf frequency as an adjustable parameter.
A simple but robust way to perform a coarse trajectory-based linear optics correction in early commissioning is a tune scan with SCtuneScan(): two quadrupole families are exercised on a grid of setpoints until the beam transmission has reached the target value. The rf frequency and phase can be corrected using SCsynchEnergyCorrection() and SCsynchPhaseCorrection(). In both functions the horizontal turn-by-turn BPM variation is minimized in order to identify the synchronous frequency and phase.
In the presence of strong sextupole magnets, a single-pass beam-based alignment (BBA) procedure is most likely required in order to store beam while the perturbed lattice properties differ significantly from the design model. Hence, a model-independent BBA procedure based on 2-turn trajectories is implemented.
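A single step of such a regularized correction can be sketched as follows. This Python example is an illustration under stated assumptions, not the SC implementation: it solves the Tikhonov-regularized normal equations (R^T R + alpha I) dk = -R^T x directly, which is mathematically equivalent to an SVD pseudo-inverse in which each singular value s is replaced by s / (s^2 + alpha). The regularization convention used in SC's appendices may differ, and the response matrix and kick values are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def corrector_step(response, readings, alpha=0.0):
    """Regularized least-squares kick change that reduces the BPM readings."""
    n_bpm, n_cm = len(response), len(response[0])
    # Normal equations: (R^T R + alpha * I) dk = -R^T x
    rtr = [[sum(response[k][i] * response[k][j] for k in range(n_bpm))
            + (alpha if i == j else 0.0) for j in range(n_cm)]
           for i in range(n_cm)]
    rtx = [-sum(response[k][i] * readings[k] for k in range(n_bpm))
           for i in range(n_cm)]
    return solve(rtr, rtx)


# Hypothetical 3-BPM, 2-corrector response matrix (m/rad) and kick errors (rad).
R = [[2.0, 0.5], [1.0, 1.5], [-0.5, 2.0]]
kick_error = [1e-4, -2e-4]
x = [sum(R[i][j] * kick_error[j] for j in range(2)) for i in range(3)]
dk_exact = corrector_step(R, x)               # alpha = 0: plain pseudo-inverse
dk_damped = corrector_step(R, x, alpha=1.0)   # regularized: smaller step
```

With alpha = 0 the step cancels the kick errors exactly in this noise-free toy case. A finite alpha gives a smaller step, which is the safer choice when the response matrix is only approximately known and the correction is iterated.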
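The idea of the tune scan can be illustrated with a toy example. The Python sketch below is not SCtuneScan(): the function signature, the ordering of the grid points (nearest to the nominal setting first) and the toy transmission function are all assumptions made for illustration.

```python
def tune_scan(measure_transmission, deltas_1, deltas_2, target):
    """Try relative set-point changes of two quadrupole families on a grid."""
    # Visit grid points closest to the nominal setting first (assumed ordering).
    grid = sorted(((d1, d2) for d1 in deltas_1 for d2 in deltas_2),
                  key=lambda p: p[0] ** 2 + p[1] ** 2)
    best = (None, -1.0)
    for point in grid:
        t = measure_transmission(*point)
        if t > best[1]:
            best = (point, t)
        if t >= target:
            break  # target transmission reached, stop scanning
    return best


# Toy stand-in for the machine: transmission peaks at (+2%, -1%).
def toy_transmission(d1, d2):
    return max(0.0, 1.0 - 400.0 * ((d1 - 0.02) ** 2 + (d2 + 0.01) ** 2))


steps = [i * 0.01 for i in range(-3, 4)]
point, transmission = tune_scan(toy_transmission, steps, steps, target=0.99)
```

If the target is never reached, the sketch returns the best grid point found, which a real procedure could use as the starting point for a finer scan.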
Performing linear optics correction is an essential step during machine commissioning. The LOCO method is implemented in AT and widely used for storage rings [17,18]. To this end an SC-LOCO interface in terms of a library SClocoLib() has been developed to allow for convenient application of the established LOCO workflow while using the SC data structure.

III. ALS-U ACCUMULATOR RING
The proposed lattice for the Advanced Light Source upgrade (ALS-U) [19] into a diffraction-limited soft x-ray light source is a 9-Bend Achromat reproducing the 12-fold symmetric footprint of the existing ALS [20]. The required small emittance is achieved by much stronger focusing than in the present ALS. Stronger focusing leads to larger natural chromaticities and smaller dispersion. Thus a large increase in sextupole strength is needed, resulting in a small dynamic aperture on the order of 1 mm even for the ideal lattice.
Due to the small dynamic aperture, traditional off-axis accumulation injection is not feasible. Therefore, the ALS-U storage ring (SR) requires on-axis swap-out injection, which exchanges a spent bunch train with a replenished bunch train simultaneously. For this purpose a full energy accumulator ring (AR) [21] will be housed in the SR tunnel, acting as a damping ring for the beam from the booster and storing the beam for top-off in between swap-outs. Figure 3 shows a schematic drawing of the ALS-U facility and Table I reports the main AR parameters.
The ALS-U Accumulator Ring lattice is similar to the current ALS lattice, but adjusted to account for the smaller circumference. Since the Accumulator Ring is mounted on the inner wall of the Storage Ring tunnel (cf. Fig. 5) and in order to save costs in general, a major effort was made to reduce the magnet size. For example, the weight of dipole magnets in the current ALS is about 3500 kg in contrast to the 1100 kg in the ALS-U AR.
Consequently, the arc vacuum chamber aperture in the ALS-U AR is smaller than in comparable third generation light sources. A direct consequence is that higher order magnet multipoles, both systematic and random, have a larger impact on the beam dynamics. At the same time the beam transmission, especially in early commissioning, is significantly affected by the smaller aperture.
In order to minimize dark time of the accelerator, the installation of the ALS-U AR is scheduled during regular ALS maintenance and two annual shutdown periods lasting several months. Beam based commissioning of the AR will take place during regular user operation of the ALS which limits the available number of beam injections into the AR significantly.
Therefore, although the ALS-U AR can be considered a third generation storage ring, defining an error tolerance budget and commissioning the machine may differ significantly from the experience gained with previous machines.
To address the challenges posed by rapid commissioning and, more generally, to understand how realistic errors will affect the machine operation and to better define an error tolerance budget during the design process, we have carried out complete simulations of machine commissioning. The correction chain, of which a preliminary version was presented earlier [15], is described in detail in Sec. IV. First, we describe the simulation setup and give a definition of the considered error sources, including an analysis of their impact without any further correction.

A. Simulation setup and lattice
The AR triple-bend achromat lattice consists of 12 identical arcs, each equipped with 6 BPMs suitable for turn-by-turn measurement of the beam position and bunch charge.
In every sector two QFA magnets are located between three identical dipole magnets (BEND). Each of the two magnet families is powered by one power supply. Two sextupole families are located in the straight section and two sextupole families are located adjacent to the QFA magnets. The individually powered quadrupoles QF and QD are placed in the straight sections. Horizontal and vertical corrector magnets (CM) suitable for slow trajectory correction are installed in six sextupole magnets per sector. Skew quadrupole corrector coils can be utilized in one sextupole magnet per sector. A schematic drawing of the lattice properties including the position of the CMs and BPMs is shown in Fig. 4.
A schematic drawing of one arc is shown in Fig. 5. The 17 magnets are mounted on 9 individual girders which are mounted to the inner wall of the storage ring cave and additionally supported by studs where the distance of the AR to the wall exceeds several cm. The three dipole magnets are each mounted on a separate girder. Sections 12, 1 and 2 are connected to transfer lines which requires a slightly different layout of the support structure. The girder placement, however, is identical.
The booster-to-accumulator (BTA) transfer line [22] transports the electron beam from the existing ALS booster to the accumulator ring. Simultaneous operation of the current ALS storage ring and the ALS-U accumulator ring requires us to switch the injections between the ALS operation and the AR commissioning. Thus, a dipole magnet will branch out of the existing booster-to-storage ring (BTS) line to the new BTA line.
Injection into the accumulator ring from the BTA is done off-axis in top-off mode to replenish the spent bunches swapped out of the storage ring. In contrast to the concept of a closed orbit bump as used in the ALS, the injection scheme for the AR involves two pulsed dipole kickers. After passing the injection septum, the injected bunch performs a large betatron oscillation in the horizontal plane through the first sector of the AR until it is kicked on-axis by a dipole kicker in the second sector [23]. In regular operation, a pulsed predipole kicker in sector 7 is used to condition the trajectory of the stored beam, thus preventing particle losses.

B. Error definition
This section provides an overview of the considered error sources. A summary of the corresponding values can be found in Table II.
Misalignments.-We assume transverse horizontal and vertical offsets of sectors, girders, and magnets within one girder as well as girder rolls and magnet rolls around the beam axis.
The overall offset of a particular magnet from the design axis is the sum of the offset of the sector, the offset resulting from the misaligned girder, and the individual magnet offset, see Fig. 2(b). The rolls are calculated analogously by summing up magnet and girder rolls. Figure 6 shows an exemplary offset distribution for the AR.
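The stack-up described here is a plain sum over the levels of the support structure. A minimal Python sketch, with hypothetical function names and illustrative numbers that are not taken from Table II, and with the girder contribution assumed to be already evaluated at the magnet location:

```python
def total_offset(section, girder, element):
    """Sum the (dx, dy) offsets of the support levels an element sits on."""
    return (section[0] + girder[0] + element[0],
            section[1] + girder[1] + element[1])


def total_roll(girder_roll, element_roll):
    """Rolls around the beam axis add up in the same way."""
    return girder_roll + element_roll


# Hypothetical numbers in metres and radians.
dx, dy = total_offset(section=(100e-6, -50e-6),
                      girder=(30e-6, 20e-6),
                      element=(-10e-6, 5e-6))
roll = total_roll(200e-6, -50e-6)
```

Because the contributions add linearly, magnets on the same girder share the girder and sector terms, so their offsets are correlated even when the individual magnet errors are independent.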
Magnet strength.-All magnets are considered to have fractional field strength errors of their main component of 0.1% rms. Horizontal and vertical CMs have a calibration error of 5% rms. Additionally, multipole errors are included as discussed below.
BPMs.-BPM errors considered in the commissioning simulation include calibration errors as well as rolls around the beam axis, offsets, noise for a single pass, and the stored beam reading. The value for the single pass noise presupposes a bunch charge of 0.4 nC.
Similar to the magnet misalignments, the overall BPM offset and roll are a sum of the misalignment of the corresponding girder and the individual BPM.
Injection.-The design injected-beam size and injected-beam systematic and jitter errors from the booster into the AR are listed in Table III. The rms beam size is determined by the booster beam properties, e.g., an emittance of 300 nm with 10% emittance ratio.
The transverse AR injected-beam systematic and jitter errors are based on the commissioning simulation of the BTA transfer line as described in Sec. IV A. The values for the longitudinal phase space are determined based on measurements at the current ALS Booster. The beta functions at the exit of the booster ring are considered to have an uncertainty of 50%.
Magnet multipole errors.-Magnet multipole errors can be categorized into systematic and random components. The systematic multipole errors are induced by deviations of the magnet design from an ideal magnet, e.g., with infinitely wide pole tips.
Systematic multipole errors from the primary coils are considered as well as those induced by powering the dipole and skew quadrupole corrector magnets.
OPERA-3D [24] was used to determine the effective multipole fields by integrating cylindrical multipole components along the beam trajectory. The ideal beam trajectory is calculated by tracking on-energy electrons starting from the specified beam center of the magnet. This procedure of calculating multipole field components is valid when proximity effects are negligible.
The systematic multipole errors induced by the primary component are scaled according to the magnet design set point. For the systematic multipole errors induced by the corrector coils we use a simplified model which assumes a static Gaussian distribution of corrector setpoints instead of updating the multipole errors each time the corrector setpoint is changed. We use a conservative estimate of $\sigma_{\mathrm{CM}} = 100\ \mu\mathrm{rad}$ and $\sigma_{\mathrm{skew}} = 0.1/\mathrm{m}$ (normalized integrated gradient) for the dipole and skew quadrupole correctors, respectively. For each corrector magnet the systematic multipole errors are assigned randomly, drawn from the Gaussian distribution defined above and added to the contribution resulting from the primary coils of that magnet.
In addition to systematic multipole errors, random multipole errors are considered, e.g., those that result from manufacturing imperfections. Their values are calculated using a Monte Carlo simulation of various mechanical error realizations [25].
Aperture.-The inner radius of the circular vacuum chamber in the arc sections is 14.2 mm. However, the effect of vacuum chamber misalignments has to be considered. The clearance between the outer radius of the chamber and the magnets is 1.6 mm, which gives an upper limit for the possible misalignment. For the aperture model used in the commissioning simulation we conservatively use the worst-case projection, i.e., a circular aperture with a radius of 12.6 mm in the arc sections.
A detailed listing of the chamber design and the used aperture model for the complete lattice including the pulsed dipole kickers and the septa can be found in Table IV.
Circumference.-In the operation of storage rings a change in the orbit circumference (due for example to slow ground motion) is accommodated by a slight modification of the rf frequency to maintain the beam energy on target. This will be the case for the SR and during commissioning for the AR as well. However, this will not be possible for the AR during normal operations, since the rf frequencies of the two rings will be locked (at about 500 MHz) to synchronize the swap-out injection. As a result, in the AR the response to a circumference perturbation occurring in either the AR or SR has to be an adjustment to the bending fields. Since bending is accomplished by combined-function magnets, the adjustment entails a disturbance to the linear optics that will have to be corrected. The relevant perturbation is the difference between the two rings' orbit circumferences rather than the absolute change. Because of their close proximity, the ground motion under the two rings can be expected to be highly correlated and the differential variation to be only a small fraction of the absolute ∼2 mm seasonal change observed in the ALS. Indeed, the estimate based on historical ALS data going back about nine years indicates an rms ΔC ≃ 125 μm differential deviation as shown in Fig. 7. The measured data are based on monuments placed around the ALS tunnel upper inner wall where the AR will be anchored. In the simulations of the AR normal operation we assume a conservative ΔC = 400 μm rms, corresponding to Δf = 1.1 kHz rms. In the AR commissioning simulations we assume an initial ΔC = 200 μm rms deviation between the design and realized circumference after the initial machine alignment.
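The correspondence between the quoted circumference and frequency deviations follows from |Δf|/f = |ΔC|/C at fixed harmonic number. The Python sketch below reproduces the arithmetic. The AR circumference of 182 m used here is an assumption, chosen because it is roughly the value implied by the quoted ΔC = 400 μm, Δf = 1.1 kHz and 500 MHz; the authoritative number is the one in Table I.

```python
def rf_frequency_shift(delta_c, f_rf, circumference):
    """Magnitude of the rf frequency change matching a circumference change."""
    # At fixed harmonic number, |df| / f = |dC| / C.
    return f_rf * delta_c / circumference


F_RF = 500e6  # Hz, "about 500 MHz" from the text
C_AR = 182.0  # m, assumed AR circumference (not stated in this section)

df_operation = rf_frequency_shift(400e-6, F_RF, C_AR)      # about 1.1e3 Hz
df_commissioning = rf_frequency_shift(200e-6, F_RF, C_AR)  # half of the above
```

Under the same assumption, the 200 μm rms deviation assumed for commissioning corresponds to about half the frequency offset quoted for normal operation.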

C. Error analysis for the uncorrected lattice
The impact of errors before any correction gives a measure of the machine state in early commissioning and can be used to draw comparison between different machines. The ALS-U AR small magnet and vacuum chamber aperture enhances the machine sensitivity to multipole field-errors on the one hand and magnet misalignments and beam jitter errors on the other. We consider the two effects separately.
In a first set of simulations we consider the effect of magnet multipole errors (excluding dipole components), affecting the dynamic aperture but not the closed orbit. We distinguish between random and systematic multipoles. Among the systematic multipoles we distinguish between those contributed by the magnet primary and secondary (dipole and skew quadrupole corrector) coils, as detailed in the previous section. We calculate the dynamic aperture (DA) over 1024 turns. The horizontal/vertical beta functions at the observation point (straight-section midpoint) are 15 m/5 m. Figure 8 shows the mean DA and variance over 100 lattice realizations with and without any physical aperture. Without physical aperture the DA reduction from the primary coil systematic multipole errors is significant (blue vs yellow). The further reduction after including the dipole CMs is also severe, while the impact of multipole errors attributed to skew corrector coils is negligible. However, the right plot reveals that the dynamic aperture is in fact dominated by the physical aperture and the subsequent reduction from multipole errors is relatively small.
FIG. 7. Long term circumference measurement at different monuments inside the current ALS tunnel. Plotted is the circumference change with respect to the previous measurement as measured at the floor (blue), the inner wall at 2 m (red) and near the ground (yellow) and via fiducial points on the ALS dipole magnets (purple) over 9 years of operation.
In a more comprehensive analysis we included the errors listed in Tables II and III, and performed a scaling study in which all the errors are multiplied by a scaling factor. Thus, an error scaling factor of 1 corresponds to the nominal errors. For each lattice realization the rms closed orbit deviation is calculated (if the closed orbit is found by AT's findorbit6) as well as the dynamic aperture and the beta function distortion Δβ/β. The evaluation is performed with and without the physical aperture model.
Results for 500 error realizations are shown in Fig. 9. For the nominal errors the closed orbit exists in about 75% of the cases without aperture (upper left plot). However, when including the physical aperture the fraction decreases to 2%.
Similarly, the dynamic aperture including the physical aperture decreases to about 1/5 of its ideal value (cf. Fig. 8) at 50% of the nominal errors and is virtually zero above an error scaling factor of 0.7.
The lower left plot shows the rms closed orbit deviation for the horizontal and vertical plane and reveals that the small vertical aperture is indeed driving the performance decrease. The apparent improvement of the vertical closed orbit deviation with increasing error scaling factor arises because the closed orbit lies within the physical aperture, and therefore exists, only for lattice realizations with sufficiently small vertical orbit deviation.
For comparison, with errors scaled to half of their nominal values and no corrections applied, a closed orbit exists in only 10% of the lattice realizations for a fourth generation light source like APS-U [7], in contrast to 60% for the ALS-U AR. For the ALS-U storage ring, the closed orbit exists in 10% of the cases at an error scaling factor of only 0.2.
However, considering the rms orbit deviation and the dynamic aperture at the nominal errors it can be concluded that virtually no injected particle is expected to be captured by the AR without further lattice correction, which will be described in the following.

IV. SIMULATED COMMISSIONING
In this section we describe in detail the commissioning strategy we have developed for the ALS-U accumulator ring. After a methodical evaluation of alternate paths and statistical analysis of outcomes we have identified the following sequence as yielding the best performance:

(A) Beam injection into the AR
(B) Improve initial transmission
(C) Sextupole ramp-up
(D) rf correction
(E) Trajectory based optics correction
(F) Beam based alignment
(G) Closed orbit correction
(H) LOCO based optics correction
(I) rf frequency adjustment
Each step will be described in detail in the following. The implemented correction chain is available as a MATLAB® script on the SC homepage [16].
RMS machine-error and injected beam trajectory error realizations are assigned according to the values reported in Tables II and III, with each error source following a Gaussian distribution truncated at ±2σ. The injected bunch is represented by a six-dimensional ±3σ-truncated Gaussian distribution of 400 particles with rms sizes also reported in Table III. The presented results of correction steps are typically shown for a population of 200 error realizations.
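A minimal sketch of how such truncated error realizations could be drawn is given below. It is not the SC implementation: the function name, the rms values, and the choice to truncate each coordinate independently by rejection sampling (ignoring correlations between coordinates) are assumptions made for illustration.

```python
import random


def truncated_gauss(rng, sigma, cutoff):
    """Draw a zero-mean Gaussian sample, redrawing until |x| <= cutoff * sigma."""
    while True:
        x = rng.gauss(0.0, sigma)
        if abs(x) <= cutoff * sigma:
            return x


rng = random.Random(1)

# Hypothetical rms magnet offset (NOT taken from Tables II/III), truncated at 2 sigma.
sigma_offset = 100e-6  # m
offsets = [truncated_gauss(rng, sigma_offset, 2.0) for _ in range(1000)]

# One injected particle: six coordinates, each truncated at 3 sigma.
# The rms sizes below are placeholders, not the Table III values.
sigma_6d = [1e-3, 1e-4, 1e-3, 1e-4, 1e-3, 1e-3]
particle = [truncated_gauss(rng, s, 3.0) for s in sigma_6d]
```

Rejection sampling keeps the shape of the Gaussian core unchanged; at a 2σ cutoff roughly 5% of the draws are discarded and redrawn.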
Synchrotron radiation is generally included in every lattice element. We emphasize that commissioning starts with both the rf cavities and the sextupole magnets switched off.

A. Beam injection into the AR
The beam-dynamic simulation starts at the exit of the booster as the beam enters the existing Booster-to-Storage-Ring (BTS) transfer line. We perform the study using the BTA lattice including the BTS and the first section of the AR up to the first BPM downstream of the dipole kicker in sector 2, which is designed to kick the injected beam on axis during commissioning. The transition from the BTA coordinate system into the AR coordinate system is performed by applying a horizontal offset and kick angle to the beam when entering the AR, thus using the T1 field of the first AR lattice element. The pulsed dipole kicker is switched on for the first turn and the ideal injected beam trajectory is used as a reference for the upcoming trajectory correction.
After applying the nominal errors to the lattice the beam usually gets lost at the transition from the old BTS to the BTA at around s = 24 m due to aperture reduction at that location (cf. Fig. 10). For trajectory correction we use an iterative feedback-like approach which is described in Appendix B. Figure 10 shows results of the beam transmission after successfully applied trajectory correction. The mean transmission is about 98% and dominated by losses occurring in the septum at around 40 m.

B. Improve initial transmission
Having established successful transmission through the first arc, from now on for simplicity we start the beam simulation from the location of the pulsed dipole kicker in sector 2 with the injected beam errors listed in Table III. An example of beam injection into the AR without any additional trajectory correction is shown in Fig. 11. In this example, the particles hit the vertical aperture in the dipole magnets, causing a beam loss of more than 60% of the particles at about 60 m. Figure 12 shows a cumulative distribution function of the beam-loss location for many error realizations. On average the beam gets lost within the first half turn. The first step in the correction chain is therefore to establish transmission throughout one turn.
To this end we employ a feedback-like iterative trajectory correction approach, which is described in detail in Appendix B. After this initial trajectory correction is carried out and the machine shows full transmission over one turn, we set the goal to reach two-turn transmission. This is achieved by "stitching" the BPM readings in the second turn to the readings of the first turn. At first only a small number of up to six BPMs located in the first sector is used for stitching. Once beam transmission through the full second turn is established, all BPMs are included in the feedback algorithm in order to minimize the overall BPM reading.
Finally, the BPM readings in the first turn are used as defining the reference trajectory for the BPM readings in the second turn. This corrects the machine to a state with a period-one orbit and typically yields full transmission through a large number of turns. The center plot in Fig. 12 shows the beam transmission after the described trajectory correction sequence. The average transmission (about 110 turns) is close to the number of turns that in an ideal lattice a beam survives with the sextupoles turned off. This suggests that the tune shift due to the large chromaticities, about −30/−40 in the horizontal/vertical planes, is the transmission limiting factor at this point. For comparison, in the ideal lattice with the rf cavities turned off radiation loss would start to cause losses only after about 650 turns. The next consequential step in the correction chain is therefore to switch on the sextupole magnets.

C. Sextupole ramp-up
In addition to beam loss, the natural chromaticities cause decoherence that quickly degrades the BPM readings within a few turns. This is another reason to turn on the sextupoles at this point since turn-by-turn evaluation of the betatron oscillations over several turns as needed for [...]. Ramping up the sextupoles in steps of 1/10 of their nominal strength while applying the previously described trajectory feedback after each step turned out to be successful in 100% of the cases, and the beam transmission is increased significantly (see Fig. 12).
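The ramp described above amounts to a simple loop. The sketch below shows only the control flow; `set_scale` and `apply_feedback` are hypothetical callbacks standing in for the lattice update and the trajectory feedback, and aborting on a failed feedback step is an assumption, not something stated in the text.

```python
def ramp_sextupoles(set_scale, apply_feedback, n_steps=10):
    """Ramp sextupoles to nominal strength in equal steps, correcting after each."""
    history = []
    for step in range(1, n_steps + 1):
        scale = step / n_steps      # fraction of nominal sextupole strength
        set_scale(scale)            # hypothetical: scale all sextupole setpoints
        ok = apply_feedback()       # hypothetical: trajectory feedback, True on success
        history.append((scale, ok))
        if not ok:                  # assumed behaviour: stop ramping on failure
            break
    return history


# Stub callbacks so that the sketch runs stand-alone.
applied = []
history = ramp_sextupoles(applied.append, lambda: True)
```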

D. rf correction
After final optics correction, the rf frequency will be adjusted to meet the requirements of the storage ring (see Sec. IV I). At this point of commissioning, however, the storage ring is not in operation and the rf frequency and phase of the AR cavity can be corrected such that the injected beam is longitudinally launched on the closed orbit. The implemented correction routines make use of the fact that a turn-by-turn (TBT) energy variation will result in a TBT horizontal BPM variation due to dispersion. Thus, the average horizontal BPM difference of all BPMs between two turns is a measure of the energy gain or loss of the bunch.
At first, the rf phase of the cavity is changed in steps within ±π and for each step the BPM readings are evaluated over 25 turns. Since the synchrotron period is 185 turns, the evaluated interval covers only a small fraction of a synchrotron oscillation and hence gives a good approximation of the "local" longitudinal phase-space motion at injection. The average horizontal TBT BPM variation is evaluated as a function of the rf phase, a sine function is fitted and the zero crossing is identified as the synchronous phase. See Fig. 13 for an example.
Considering a well corrected rf phase, the rf frequency is corrected similarly by evaluating the mean TBT horizontal BPM variation over 130 turns as a function of a frequency change within ±1 kHz. A straight line is fitted and the zero crossing is identified as the synchronous frequency.
Good correction accuracy of either phase or frequency requires good correction of the other. In order to catch rare cases of, e.g., an unfortunate combination of a large circumference and frequency error, both corrections are performed in a loop with three iterations. The final corrected phase error and relative energy error between the injected beam and the closed orbit are 1.2° and 2 × 10⁻⁵, respectively. This is a satisfactory result considering the relatively large longitudinal size of the injected beam as shown in Table III.
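The phase scan can be illustrated with a small stand-alone example. It is a sketch under stated assumptions, not the SC routine: the signal is synthetic, the phases are assumed to be equally spaced over one full period so that the sine fit reduces to two Fourier sums, and the rising zero crossing is taken as the synchronous phase (the appropriate crossing depends on the sign convention).

```python
import math
import random


def fit_zero_crossing(phases, signal):
    """Fit signal = A*sin(phase - phi0) + c on equally spaced phases over 2*pi."""
    n = len(phases)
    a = 2.0 / n * sum(y * math.sin(p) for p, y in zip(phases, signal))  # A*cos(phi0)
    b = 2.0 / n * sum(y * math.cos(p) for p, y in zip(phases, signal))  # -A*sin(phi0)
    return math.atan2(-b, a), math.hypot(a, b)  # rising zero crossing, amplitude


# Synthetic scan: 20 phase steps within [-pi, pi), arbitrary signal units.
n_steps = 20
phases = [-math.pi + 2.0 * math.pi * k / n_steps for k in range(n_steps)]
true_phi0, true_amp, offset = 0.7, 1.0, 0.3  # hypothetical values
rng = random.Random(0)
signal = [true_amp * math.sin(p - true_phi0) + offset + rng.gauss(0.0, 0.05)
          for p in phases]

phi0, amp = fit_zero_crossing(phases, signal)
```

Because the phases cover a full period uniformly, a constant offset in the signal drops out of both sums and does not bias the fitted zero crossing.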
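The frequency step can be sketched in the same spirit with an ordinary least-squares line fit. The data below are synthetic and the slope and offset are hypothetical; only the fit-and-zero-crossing logic reflects the description above.

```python
def line_zero_crossing(x, y):
    """Least-squares straight line through (x, y); returns (zero crossing, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


# Synthetic scan of the rf frequency change within +-1 kHz (11 steps).
delta_f = [-1000.0 + 200.0 * k for k in range(11)]       # Hz
true_zero, true_slope = 250.0, -2.0e-9                   # hypothetical: Hz, m/Hz
mean_tbt_shift = [true_slope * (f - true_zero) for f in delta_f]

f_sync, slope = line_zero_crossing(delta_f, mean_tbt_shift)
```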

E. Trajectory-based optics correction
At this point the beam survives 20000 turns, thus more than two damping times, which is our definition of beam capture, in 98% of the cases. Nevertheless, in order to achieve beam capture in all cases, linear optics correction is performed.
We studied different trajectory-based linear optics correction strategies. Fitting the injection trajectory and a limited number of quadrupole K-values to match the (single/multiturn) BPM readings with the design trajectory turned out to perform poorly, in particular because of the large BPM offsets at this stage of commissioning. Circumventing these errors by evaluating only a difference trajectory, e.g., a response to a CM in the BTA transfer line, was not successful either, due to the large number of error sources compared to the number of observables.
Performing a LOCO-like optics correction strategy with a (single/multiturn) trajectory response matrix showed promising results. However, the signal to noise ratio was found to be a critical parameter and considering the relatively large injection jitter, a reasonable trajectory response matrix measurement required several hundred beam injections.
It turned out that the most efficient way at this point is a simple but robust tune scan, while postponing an accurate optics correction scheme until the beam is captured. For the tune scan the quadrupole families QF and QD are exercised coherently on a grid of K_F and K_D values in a spiral-like pattern until the beam transmission after 500 turns is above 80%. A low number of turns with a high transmission was found to be a good approximation of beam capture while minimizing the computational costs of the evaluation.
The final transmission after the tune scan at 20000 turns is above 70% in all cases and the beam can be considered captured. Note that the beam is considered lost when the transmission falls below 40%.
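One possible way to organise such a scan is sketched below. The ordering (square rings of increasing size around the starting point, traversed by angle), the relative step size, and the `transmission` callback are all assumptions made for illustration; the text only specifies a spiral-like pattern and the 80% criterion after 500 turns.

```python
import math


def spiral_offsets(n_rings):
    """Grid offsets (i, j), ordered ring by ring outward from (0, 0)."""
    points = [(i, j)
              for i in range(-n_rings, n_rings + 1)
              for j in range(-n_rings, n_rings + 1)]
    return sorted(points,
                  key=lambda p: (max(abs(p[0]), abs(p[1])), math.atan2(p[1], p[0])))


def tune_scan(transmission, n_rings=3, rel_step=0.005, threshold=0.8):
    """Scale the QF/QD families along the spiral until transmission > threshold."""
    for n_tried, (i, j) in enumerate(spiral_offsets(n_rings), start=1):
        scale_f = 1.0 + i * rel_step   # relative K_F of the whole QF family
        scale_d = 1.0 + j * rel_step   # relative K_D of the whole QD family
        if transmission(scale_f, scale_d) > threshold:
            return scale_f, scale_d, n_tried
    return None                        # no working point found on the grid


def toy_transmission(scale_f, scale_d):
    """Stand-in for 500-turn tracking: only one grid point transmits well."""
    good = abs(scale_f - 1.01) < 1e-9 and abs(scale_d - 0.995) < 1e-9
    return 0.9 if good else 0.3


result = tune_scan(toy_transmission)
```

Starting at the nominal setting and moving outward means that lattices needing only a small tune change are found after few injections.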

F. Beam based alignment
The beam based alignment (BBA) routine for stored beam is not yet implemented in SC. However, successful routine operation at the ALS [26] indicates that performing BBA at the ALS-U AR after achieving beam capture will be straightforward. Based on measurements at ALS we proceed by conservatively crediting BBA for a reduction of BPM offsets to 50 μm rms.

G. Closed orbit correction
After reducing the BPM offsets a more ambitious closed orbit correction can be applied in order to reduce feed down optics perturbations from sextupole magnets and other higher order multipoles.
At first, the actual response matrix is measured as well as the dispersion by changing the rf frequency. The previously described orbit feedback is applied including dispersion, thus with the rf frequency as an adjustable parameter. The correction is performed in a loop with a successively decreasing regularization parameter α for the calculation of the pseudo-inverse matrix.
If the feedback algorithm returns an error, the CM setting of the last iteration is used and the loop is stopped. If the algorithm converged, the final rms BPM reading is compared to the initial reading. If there was no improvement, the CM setting of the last iteration is used and the loop is stopped as well. Thus, the α-loop is stopped if a decreased α did not result in a decreased rms BPM reading, which typically happens because the calculated CM setpoints exceed their limits.
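The stopping logic of this α-loop can be written compactly. The sketch below is an assumption-laden illustration: the callbacks `run_feedback`, `rms_reading`, `get_cm`, and `set_cm` are hypothetical stand-ins, a feedback failure is modelled as a `RuntimeError`, and the toy machine at the end simply looks up an rms reading per α.

```python
def alpha_loop(alphas, run_feedback, rms_reading, get_cm, set_cm):
    """Run orbit feedback with decreasing alpha; stop when it fails or stops helping."""
    last_good = None
    for alpha in alphas:                     # successively decreasing values
        cm_before, rms_before = get_cm(), rms_reading()
        try:
            run_feedback(alpha)
        except RuntimeError:                 # feedback returned an error
            set_cm(cm_before)                # fall back to the previous CM setting
            break
        if rms_reading() >= rms_before:      # converged, but no improvement
            set_cm(cm_before)
            break
        last_good = alpha
    return last_good


# Toy machine: the rms reading improves down to alpha = 4 and degrades below,
# mimicking CM setpoints that run into their limits.
toy_rms = {None: 1.0, 10: 0.5, 8: 0.4, 6: 0.3, 4: 0.2, 2: 0.6, 1: 0.9}
state = {"cm": None}


def toy_feedback(alpha):
    state["cm"] = alpha                      # label the CM setting by its alpha


last_alpha = alpha_loop([10, 8, 6, 4, 2, 1], toy_feedback,
                        lambda: toy_rms[state["cm"]],
                        lambda: state["cm"],
                        lambda cm: state.update(cm=cm))
```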
Results are shown in Fig. 14. The final closed orbit deviation is about 120 μm rms and the required corrector strength is well within the 0.2 μrad limit. Typically, the last successful iteration was using a regularization parameter between 3 and 5. This is consistent with the results shown in Fig. 17 which indicate the regularization strength at which the AR CM limits prevent a smaller BPM reading.
The average number of required beam injections is 192 with a standard deviation of about 9 injections. This small relative spread within the error realizations indicates that the required number of injections for the correction chain is dominated by the fixed number of steps in scans like, e.g., the rf commissioning, which is equal for all error realizations.
It is worth mentioning that for the more challenging error assumptions encountered during magnet multipole scaling studies the mean required number of injections quickly exceeds 1000, with a relative spread of up to $\sqrt{\langle N^2 \rangle}/\langle N \rangle \leq 1$. This relatively large spread indicates that correction algorithms such as the first-turn threading with its wiggling, or the tune scan, both of which may require many injected beams depending on the actual error realization, play a larger role in the overall correction chain.

H. LOCO-based optics correction
At this point of the commissioning process linear optics correction can be performed. As described in Sec. II B, the well established LOCO algorithm [18] can be used conveniently within the SC framework.
The developed correction sequence for the ALS-U AR consists of different steps, each followed by orbit correction using the previously described algorithm. The first step includes a coarse correction using all QF and QD quadrupole magnets while at first ignoring coupling (off-diagonal response matrix blocks) and diagnostic errors. After two iterations, calibration factors of the BPMs and CMs are fitted as well. Thereafter LOCO is applied in a loop with a chromaticity correction. All QF, QD quadrupoles are used as well as all available skew quadrupole correctors. Coupling and diagnostic errors are included in the fit. A simulated beam-based chromaticity correction is not yet implemented; instead we use a simple fitting scheme on the assumption, based on ALS operational experience and modelling, that the chromaticity can be measured and corrected without problems.
Results shown in Fig. 15 indicate that all requirements have been met. For example, the horizontal emittance is below 2 nm with less than 1% coupling and the corrector limits are not exceeded. The relatively large excursion of QD values is due to the fact that its K-value is about 10 times smaller than for the QF magnets. The dynamic aperture (see Fig. 16) can be well corrected.

I. rf frequency adjustment
The last step in the commissioning process of the AR is the frequency adjustment once the SR goes into operation, see Sec. III B.
We assume an initial 0.2 mm rms SR-circumference error and conservatively estimate the differential circumference change between the AR and the SR to be 0.2 mm rms as well. The sum of these two errors (0.4 mm) corresponds to a 1.1 kHz AR rf frequency error. We simulate the impact of the differential circumference variation by adding a random error Δf with 2σ-truncated normal distribution and 1.1 kHz rms spread to the current AR rf frequency as determined during the commissioning simulation. For a given Δf realization, the bending-magnet field strength is scanned to identify the setting yielding the nominal beam energy. For a 1.1 kHz frequency step this induces a beta beat of 3% and 0.5% in the horizontal and vertical plane, respectively. The QFA family quadrupoles are then adjusted (with relative adjustment equal to that of the bending magnets), followed by orbit correction, and finally a LOCO-based linear-optics correction including all quadrupoles. Figure 16 shows that the correction is effective at restoring the dynamic aperture. The mean and standard deviation of the horizontal and vertical emittance before the rf frequency adjustment are ε_x = 1.820 ± 0.004 nm and ε_y = 4.5 ± 3.2 pm, respectively. After the frequency adjustment the values are ε_x = 1.822 ± 0.025 nm and ε_y = 4.7 ± 4.7 pm. Thus, a slightly increased emittance spread throughout the lattice realizations can be observed, which is within acceptable limits.
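The quoted conversion from circumference error to rf frequency error follows from the fixed harmonic number, for which |Δf/f| = |ΔC/C|. The short check below uses round values for the AR circumference and rf frequency that are assumed for illustration only; neither number is quoted in this section.

```python
# Assumed round values, for illustration only (not quoted in this section).
circumference_m = 182.0          # assumed AR circumference
f_rf_hz = 500.0e6                # assumed rf frequency

# Linear sum of the two 0.2 mm rms circumference errors, as in the text.
delta_c_m = 0.2e-3 + 0.2e-3

# Fixed harmonic number: |df / f| = |dC / C|.
delta_f_hz = f_rf_hz * delta_c_m / circumference_m
print(f"{delta_f_hz / 1e3:.2f} kHz")
```

With these assumed inputs the result is close to the 1.1 kHz stated above.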
The errors in Tables II and III are thus considered as tolerable.

V. SUMMARY AND CONCLUSION
For 4th generation storage-ring light sources the ability to carry out commissioning effectively and rapidly is crucial. To prepare for this task we have developed an extension to the MATLAB®-based Accelerator Toolbox (AT), the Toolkit for Simulated Commissioning (SC), which allows for realistic simulations of the commissioning process of storage rings. The toolkit was used to perform a start-to-finish commissioning simulation of the ALS-U Accumulator Ring, a MATLAB® script of which is available on the SC homepage [16].
We have succeeded in identifying an effective sequence of commissioning steps, including trajectory/orbit correction, commissioning of the rf cavities and linear optics correction. For trajectory control we use an iterative feedback-like approach based on the Tikhonov regularization of the SVD pseudo-inverse, which yields a convenient handle to trade off the final rms BPM reading against the rms CM strength.
Due to the locked rf frequencies of the ALS-U AR and SR, the AR synchronous energy has to be adjusted by exercising the (combined function) dipole magnets. We have shown that within the expected limits of differential ground motion between the two rings the resulting optics perturbations of the AR lattice can be corrected sufficiently well.
In detailed studies not reported here, the outcome of the commissioning simulation, that is, the performance of the corrected lattices, was used to identify the proper placement and the required number of BPMs and of dipole and skew-quadrupole corrector magnets. The SC toolkit and the described procedure were also used to set multipole field error tolerances and to define an overall error tolerance budget of both the AR and the BTA transfer line. Furthermore, the injection efficiency of various injection schemes has been evaluated in the presence of realistic errors and the AR aperture requirements have been determined.
It can be concluded that the SC toolkit is well suited to support the design process of storage rings, in particular because of its elaborate error model and the ability to realistically correct a large number of disturbed lattices during simulated commissioning. The current ALS-U [...]
[...] converges to the limit $\vec{c}_n \to \vec{c}_\infty$ given by the pseudoinverse. Often a precise measurement of the response matrix may be unfeasible or inconvenient and one would have to base $M$ on the ideal lattice model, thus only approximately representing the response matrix of the physical system, $\tilde{M} = M + \epsilon$, where $\epsilon$ is a (matrix) perturbation. Fortunately, the algorithm appears to remain generally robust against perturbations that occur in practice, consistently yielding physically meaningful (and useful) solutions. A rigorous proof of convergence can be established if the perturbation $\epsilon$ is sufficiently small [27].

APPENDIX B: APPLICATION OF THE TIKHONOV REGULARIZATION IN ITERATIVE TRAJECTORY CORRECTION
In our iterative trajectory correction approach, at each step the setting assigned to the correctors is calculated from the current BPM readings based on a regularized pseudoinverse of the model response matrix of the lattice. In contrast to other approaches (see e.g., [7]), which often start by down-selecting the BPMs and correctors from all those available in order to favor more desirable outcomes (smaller corrector strengths, reduced BPM readings) at each iteration step, all relevant parameters (including the number of CMs and BPMs to be utilized) are set at the start and kept fixed over the course of the iteration. The idea is to choose these parameters once and then let the system of BPM readings and CM settings evolve without regard as to whether every single iteration step yields the most orbit-error reduction.
We found that the iteration will eventually converge to the desirable outcome if a suitable regularization is chosen for the inverse response matrix (IRM). In numerical experiments we found the Tikhonov regularization (TR) to be most effective, see Appendix A. The Tikhonov-regularized IRM is calculated similarly to the well-known Moore-Penrose pseudo-inverse, based on the singular-value decomposition (SVD) of the response matrix. In the TR, however, the singular values are modified depending on a continuous parameter α. This parameter effectively governs the trade-off between the accuracy of a correction step and the required overall change of the CM kicks. Smaller α will generally yield better rms orbit correction but at the cost of stronger corrector settings. For the AR lattice, we found that suitable choices for α vary from a few tens, when the algorithm is applied to maximizing transmission, to a few units in the application to orbit correction (see Sec. IV G). See Ref. [27] for a more formal investigation of the merits of TR.
This approach has been shown to work very well even in the very first stage of trajectory correction, where the goal is not to reduce the orbit variation but merely to "thread" the beam through the machine in order to produce full transmission for the first time. In this case we still use the complete response matrix including all CMs and BPMs independent of whether the beam reaches them or not; readings of BPMs that do not see beam are set to zero and the correction is only applied to the CMs preceding the last BPM with useful signal.
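For reference, a common textbook form of the Tikhonov-regularized pseudo-inverse is written out below. The symbols $U$, $\Sigma$, $V$, $\sigma_i$, $\vec{b}$, and $\Delta\vec{c}$ are notation introduced here for illustration, and whether the toolkit uses $\alpha$ or $\alpha^2$ in the denominator is not stated in this section, so the exact normalization should be read as an assumption.

```latex
% SVD of the (model) response matrix, with singular values \sigma_i:
M = U \,\Sigma\, V^{T}, \qquad \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots)

% Tikhonov-regularized inverse response matrix (assumed normalization):
M^{+}_{\alpha} = V \,\mathrm{diag}\!\left( \frac{\sigma_i}{\sigma_i^{2} + \alpha^{2}} \right) U^{T}

% One correction step from the current BPM readings \vec{b}:
\Delta \vec{c} = - M^{+}_{\alpha}\, \vec{b}
```

For $\alpha \to 0$ and nonzero $\sigma_i$ the weights reduce to $1/\sigma_i$, i.e., to the Moore-Penrose pseudo-inverse, while for $\sigma_i \ll \alpha$ the corresponding directions are suppressed, which limits the corrector strength spent on poorly determined modes.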
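The bookkeeping for a threading step can be sketched as follows. The function names, the use of NaN to flag BPMs without beam, and the example numbers are assumptions for illustration; the corrector changes themselves would come from applying the regularized pseudo-inverse of the full response matrix to the zero-filled readings.

```python
import math


def zero_fill(bpm_readings):
    """Replace readings of BPMs that saw no beam (NaN) by zero."""
    return [0.0 if math.isnan(r) else r for r in bpm_readings]


def last_bpm_with_signal(bpm_s, bpm_readings):
    """Longitudinal position of the last BPM that still saw beam."""
    seen = [s for s, r in zip(bpm_s, bpm_readings) if not math.isnan(r)]
    if not seen:
        raise ValueError("no BPM saw beam")
    return max(seen)


def mask_downstream(cm_s, cm_deltas, s_last):
    """Keep corrector changes only for CMs upstream of the last useful BPM."""
    return [d if s < s_last else 0.0 for s, d in zip(cm_s, cm_deltas)]


# Hypothetical example: beam lost after the second of four BPMs.
bpm_s = [1.0, 2.0, 3.0, 4.0]                    # m
raw = [1e-3, -2e-3, float("nan"), float("nan")]
cm_s = [0.5, 1.5, 2.5, 3.5]                     # m
cm_deltas = [1e-5, 2e-5, 3e-5, 4e-5]            # rad, placeholder correction

filled = zero_fill(raw)
s_last = last_bpm_with_signal(bpm_s, raw)
masked = mask_downstream(cm_s, cm_deltas, s_last)
```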
In the rare cases where the iteration gets "stuck," typically on especially challenging physical apertures, an effective solution is to start varying ("wiggling") the last CM preceding the point of beam loss over a suitable range and keep adding immediately upstream CMs to the wiggling until the beam proceeds further and the normal iteration scheme can resume.
We should emphasize that the trade-off between BPM readings and CM strengths made possible by appropriately choosing the parameter α can, strictly speaking, only be exercised at each individual correction step and in general does not apply at the point of convergence of the iteration. Indeed, in a completely linear system, it can be proved [27] that, when the algorithm has converged, the CM settings and BPM readings are independent of the choice of regularization parameter α. However, this does not diminish the utility of this control parameter. For one thing, convergence would be prevented if along the way the required CM strengths exceeded their limits.
For another, precisely because the values of the CM settings at convergence may be undesirable or because of noise, one may want to terminate the iteration of the correction scheme before convergence is reached. This early termination restores α's influence on the final state of the system, as illustrated in the following example.
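Both statements can be checked in the simplest possible setting, a single corrector acting on a single BPM with a perfectly linear response. The toy below is a sketch only: the response `m`, the initial reading, and the regularized gain `m / (m**2 + alpha**2)` (a common Tikhonov form, assumed here) are illustrative choices.

```python
def iterate_correction(m, b0, alpha, n_iter):
    """Iterate c -> c - gain * b for a scalar linear system with response m."""
    c, b = 0.0, b0
    gain = m / (m * m + alpha * alpha)   # regularized inverse of the scalar m
    for _ in range(n_iter):
        dc = -gain * b                   # corrector change from current reading
        c += dc
        b += m * dc                      # linear response of the BPM reading
    return c, b


m, b0 = 5.0, 1e-3                        # hypothetical response (m/rad), reading (m)

# Converged state: same corrector setting and reading for both alphas.
c_small, b_small = iterate_correction(m, b0, alpha=1.0, n_iter=200)
c_large, b_large = iterate_correction(m, b0, alpha=10.0, n_iter=200)

# Early termination after three steps: alpha now matters.
c3_small, b3_small = iterate_correction(m, b0, alpha=1.0, n_iter=3)
c3_large, b3_large = iterate_correction(m, b0, alpha=10.0, n_iter=3)
```

In this toy each step multiplies the reading by α²/(m² + α²), so every α drives the reading to zero eventually, but after three steps the larger α leaves a larger residual reading and a smaller corrector setting.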
For the demonstration we use one random error realization and apply the correction chain for improving the initial transmission as described in Sec. IV B including various feedback algorithms while using different regularization parameters α to calculate the pseudoinverse of the 1 and 2 turn response matrices. Figure 17 shows the final rms corrector strength as a function of the rms BPM reading. Each data point corresponds to one regularization parameter α. For each value of α the trajectory correction scheme was applied with and without considering corrector limits.
One clearly notices the inverse dependence of the final rms BPM readings on the final rms CM strengths with a monotonic dependence on the regularization parameter α. When CM limits are included, however, at a sufficiently small α the calculated CM setpoints cannot be reached and the less regularized pseudoinverse of the response matrix is in fact increasing the BPM readings.
This clearly demonstrates that the regularization parameter α effectively provides a means to trade off the BPM reading at the end of a correction chain against the overall strength of the corrector magnets.