Model-independent tuning for maximizing free electron laser pulse energy

The output power of a free electron laser (FEL) has extremely high variance even when all FEL parameter set points are held constant because of the stochastic nature of the self-amplified spontaneous emission (SASE) FEL process, drift of thousands of coupled parameters, such as thermal drifts, and uncertainty and time variation of the electron distribution coming off of the photo cathode and entering the accelerator. In this work, we demonstrate the application of automatic, model-independent feedback for the maximization of average pulse energy of the light produced by free electron lasers. We present experimental results from both the European x-ray free electron laser at DESY and from the Linac Coherent Light Source at SLAC. We demonstrate application of the technique on rf systems for automatically adjusting the longitudinal phase space of the beam, for adjusting the phase shifter gaps between the undulators, and for adjusting steering magnets between undulator sections to maximize the FEL output power. We show that we can tune up to 105 components simultaneously based only on noisy average bunch energy measurements.


I. INTRODUCTION
Free electron lasers (FEL) are incredibly powerful scientific tools for studying physics at previously inaccessible length (nanometers) and time [femtoseconds (fs)] scales for high energy physics, biology, chemistry, material science, and accelerator physics experiments [1][2][3]. FELs can produce extremely short (∼fs) coherent X-ray bursts with tunable wavelength which are many orders of magnitude brighter than traditional sources such as synchrotrons. For example, the Linac Coherent Light Source (LCLS) FEL, the first hard x-ray FEL, provides users with photon energies ranging from 0.27 keV to 12 keV based on electron bunches with energies from 2.5 GeV to 17 GeV. Operating electron bunch charge can range from 20 pC to 300 pC and the bunch duration from 3 fs to 500 fs to suit experimental needs [4][5][6]. The European X-Ray FEL (EuXFEL), one of the newest and most advanced FELs in the world, is capable of producing 27000 pulses of bright, coherent light per second utilizing electron bunches with energies of up to 17.5 GeV, with charges ranges from 0.02 to 1 nC per bunch, and photon energies from 0.26 keV up to 25 keV [7][8][9]. Precise control of bunch lengths, current profiles, and energy spreads of increasingly shorter electron beams at femtosecond resolution is extremely important and challenging for both the LCLS and the Eu-XFEL [10,11].
The extremely bright and short x-ray bursts that make FELs such powerful instruments also makes them incredibly challenging to control. High power FELs are driven by few kilometer long high power particle accelerators composed of thousands of interacting electromagnetic components including radio frequency (rf) accelerating cavities and magnets. The performance of all of these components is susceptible to drift, e.g., such as thermal drifts. Timing is a critical issue because for certain pump-probe experiments, the flashes of light themselves are only tens of fs. Novel techniques have been developed to achieve ∼10 fs timing synchronization of FELs [12,13] and this work continues.
Another challenge is the FEL lasing process itself, selfamplified spontaneous emission (SASE) is stochastic in nature and extremely sensitive to the beam's initial conditions including charge density and energy spread, therefore there is a large variance in the output power of FEL light even when all machine set points are fixed and properly timed because of uncertainty in and time variation of the electron distribution coming off of the photo cathode and entering the accelerator. Therefore, the output power of an FEL is a highly nonlinear, time-varying, noisy and analytically unknown function of all of the FELs thousands of components. Traditional model-based approaches are severely limited by such uncertainties and time variation of both the accelerated beam's phase space distribution and the accelerator's components as well as misalignments, thermal cycling, and collective effects such as space charge forces, wakefields, and coherent synchrotron radiation emitted by extremely short high current bunches.
One example of such difficulties is the process of reconfiguring the LCLS to a low charge mode to provide 3 fs bunches, a process which may require many hours of expert hands on tuning. These difficulties will only grow for future facilities, such as the LCLS-II [14] for complex schemes such as multi-color operation [15], multi-stage amplification [16] or self-seeding [6,17]. Plasma wakefield accelerators (PWFA) are another class of particle accelerators which require extremely intense, high current and sometimes extremely short charged particle bunches with complex beam dynamics and phase space manipulations and could benefit from the type of automatic tuning performed in this work [18][19][20]. For example, the facility for advanced accelerator experimental tests (FACET-II) is being designed to provide custom tailored current profiles for various experiments with bunch lengths as low as 1 or 2 fs [21,22].
The challenges described above make the FEL an interesting candidate for optimization and tuning via modelindependent feedback. Of particular interest is the use of automated algorithms for: 1. Initial tuning, in order to achieve SASE, following an outage or major readjustment of any accelerator settings. 2. Maximizing output power/energy per pulse once SASE is established. 3. Maintaining optimal SASE despite time-varying drift of the beam and accelerator settings.
In this work we demonstrate a general adaptive feedback algorithm for automatic accelerator tuning with in-hardware demonstrations on both the LCLS FEL and the European XFEL for laser pulse energy maximization [23][24][25][26][27]. Previously, this method has been combined with neural networks for the automatic tuning and control of the longitudinal phase space of electron bunches in the LCLS FEL [28]. We demonstrate application of this technique on RF systems for automatically adjusting the longitudinal phase space of the beam, for adjusting the phase shifter gaps between the undulators and for adjusting steering magnets between undulator sections to maximize the FEL output power. We show that we can tune up to 105 components simultaneously.

A. Main results
Our main results can be summarized as: 1. We have, for the first time demonstrated in-hardware tuning of an order of magnitude increase in the number of parameters (>100 vs ∼5). 2. We have demonstrated that this method can aid operators and beam line physicists in establishing initial SASE on various components. 3. We tuned the longitudinal phase space of the EuXFEL (5 parameters) and the LCLS (6 parameters) to maximize SASE power. 4. We tuned phase shifter gaps (up to 21 simultaneously) to optimize for proper phase shift for constructive superposition of emitted light in the multi-segment undulators to occur. 5. We tuned aircoils (up to 84 simultaneously) to optimize the beam orbit through the undulator sections to maximize FEL light energy. 6. We tuned both aircoils and phase shifter gaps simultaneously (105 parameters) to help in initial SASE setup. 7. We demonstrated that all of these methods were robust to time-variation of the accelerator, the beam, and to noisy readings. 8. We implemented this method in the OCELOT accelerator tuning package with an automatic tune setup for nonexperts in the control room to use with a push of a button.

II. TUNING ALGORITHMS
Recently, many advanced controls approaches have been developed toward automatic tuning, control, and optimization of particle accelerators [29]. In this section we briefly review several families of approaches to automatic accelerator tuning, before describing the procedure implemented in this work.

A. Genetic algorithms
For the problem of tuning coupled components which have a deterministic effect on the particle beam, there is a large family of optimization schemes which take place offline [30]. Genetic algorithms (GA) and multiobjective genetic algorithms (MOGA) have been successful for the design and optimization of radio frequency cavities [31], photoinjectors [32], damping rings [33], storage ring dynamics [34], lattice design [35], neutrino factory design [36], simultaneous optimization of beam emittance and dynamic aperture [37], free electron laser linac drivers [38] and various other accelerator physics applications [39]. Multiobjective particle swarm optimization, an extension of MOGA, has recently been demonstrated for emittance reduction, with convergence rates exceeding those of MOGA approaches [40].
Genetic algorithms search over a large parameter space and result in global optimization, however, model-based results are optimal only relative to the chosen model. Once a machine is actually built, further tweaking is required due to imperfect models and finite precision of construction. Recently, the GA method has been demonstrated online, successfully minimizing the vertical beam size of the SPEAR3 storage ring [41]. Another optimization method is Robust conjugate direction search (RCDS), a local (may be trapped in local minima) model-independent algorithm which is able to optimize many parameter systems [42,43]. RCDS and particle swarm have also been used for online optimization of nonlinear storage ring dynamics [44]. Both the GA and RCDS approaches are best suited for time-invariant systems, an RCDS method for dealing with slowly drifting systems is under consideration, but further development is needed.

B. Machine learning
Recently, powerful machine learning (ML) techniques have been studied for various particle accelerator applications. ML-based tools, such as neural networks (NN), can be trained to automatically tune and control large complex systems such as particle accelerators [45][46][47][48]. ML tools are being developed to provide fast and accurate surrogate models to create diagnostics that enable feedback control and tuning of accelerators [49]. A NN model has been designed to predict the resonant frequency of the radio frequency quadrupole (RFQ) in the PIP-II Injector Experiment (PXIE), to be used in a model predictive control scheme [50]. In a preliminary simulation study for a compact THz FEL, a NN control policy was trained to provide suggested machine settings to switch between desired electron beam energies while preserving the match into the undulator and a fast surrogate model was also trained from PARMELA simulation results in order to facilitate the training of the control policy [51].
Powerful NN tools have also been developed for ML-based longitudinal phase space prediction of transverse deflecting cavity readings in particle accelerators, which are some of the most important diagnostics that exist for measuring a beam's longitudinal phase space [52,53]. A novel Bayesian optimization framework that uses sparse online Gaussian processes has been applied for quadrupole magnet tuning in an FEL [54]. Bayesian optimization methods have also been developed for maximizing FEL pulse energy [55]. Various ML tools, including clustering for identifying faulty beam position monitors (BPM) using outlier detection and ML methods for optics corrections has been developed and performed at CERN [56][57][58][59][60]. For more examples and details the reader is referred to [47,48] and the references within.
One limitation of ML is that ML alone, especially supervised learning techniques, may be insufficient for particle accelerator systems because they are time-varying and their learned characteristics are drifting. This is particularly true for linacs, FELs and PWFAs, such as LCLS, the Eu-XFEL, FACET, and the Los Alamos Neutron Science Center, and may less critical for storage rings. The ML tool must also be able to interpolate between training points, which is more difficult for complex, many-parameter systems. Recently, the adaptive modelindependent method utilized in this work was combined with ML methods for the automatic tuning and control of the longitudinal phase space of electron bunches in the LCLS FEL [28], with the ML performing approximate global tuning and the adaptive algorithm zooming in on and tracking optimal component settings despite noise and time varying drifts.

C. Model-independent feedback
Several automatic model-independent feedback methods have been under development at various accelerator facilities. At FERMI an automatic tuning method has been utilized to maximize FEL output energy, tuning between 1 and 5 components simultaneously [61,62]. OCELOT, an accelerator simulation and optimization toolbox has been developed for implementing various automatic tuning algorithms in accelerators [63,64]. OCELOT has been implemented in the EuXFEL for dispersion minimization, orbit distortion compensation with aircoils, beam loss minimization and photon pulse energy maximization via the Nelder-Mead simplex method.
In this work, we utilize a local, model-independent extremum seeking (ES) algorithm, whose convergence can also suffer due to local minima, but whose simplicity and speed of convergence allows for real time tracking of a many parameter time-varying nonlinear system. We utilize a recently developed, bounded adaptive feedback technique which is applicable for open-loop unstable many parameter noisy systems and critically, can handle unknown timevariation of the system dynamics [23][24][25]. This general method has been studied for controlling autonomous torque-actuated unicycle vehicles [26], for creating noninvasive longitudinal phase space diagnostics at FACET [27]. This algorithm is currently being added to the family of optimizers available in the OCELOT framework.
Application of the algorithm requires a user-defined cost function, which may be analytically unknown, but which depends on accelerator parameter settings and must be maximized or minimized. An example of such a function is the average pulse energy at the output of the FEL, which depends on all of the FEL parameters, such as rf systems and magnet settings. Mathematically, we represent the analytically unknown, time-varying cost function to be maximized by Cðp; tÞ and its noise-corrupted measurement asĈðp; tÞ ¼ Cðp; tÞ þ nðtÞ, where p ¼ ðp 1 ; …; p n Þ are the parameters being tuned, such as, for example, quadrupole magnet field strengths.
Our simple algorithm adjusts the parameters p j according to the dynamics where ω j ¼ ωr j and r j ≠ r i for i ≠ j. The term α > 0 is the dithering amplitude and can be increased to escape local minima. Once the dynamics have settled a parameter p j will oscillate around a local minimum with amplitude ffiffiffiffiffiffiffiffiffiffi α=ω j p . The term k > 0 is the feedback gain. For minimization, instead of maximization, we simply choose k < 0. For large ω, the dynamics of (1) are given, on average, by the simple dynamics [23][24][25] (see the Appendix): a gradient ascent, with respect to p, of the actual, analytically unknown function Cðp; tÞ although the feedback is based only on the noisy measurementsĈðp; tÞ. See the Appendix and the references for more details. Intuitively, the reason behind this convergence is that by dithering each parameter at a unique frequency the evolution of the parameters has been made orthogonal in Hilbert space in the form of the L 2 ½0; t inner product: The resulting averaged dynamics maximize a noisy, analytically unknown, time-varying function. Advantages of this approach include (1) An ability to continuously, dynamically tune many parameters of unknown, nonlinear, and open-loop unstable systems, simultaneously.
(2) Robustness to measurement noise and external disturbances and ability to track fast time-varying parameters. (3) Analytically guaranteed constraints on parameter update rates: j dp j , even when operating on noisy and analytically unknown systems, an important safety feature for in-hardware implementation.
The procedure for digitally applying (1) is an iterative method in hardware via the finite difference approximation of (1) given by: which is an accurate approximation of the derivative in (1) for Δ < 2π maxfω j g ≪ 1.
Here t n represents the actual time at which the iterative update takes place. From now on, for simplicity, we will simply write CðnÞ in place of C½pðnÞ; t n to emphasize that it is the cost function measurement at iterative step n. Implementation begins by setting parameter values to some initial conditions, pð1Þ, recording the cost function Cð1Þ, and then performing the update: and continuing, according to (4) for n ¼ 2; 3; …. The update rate, that is the time between setting parameter values from pðnÞ to pðn þ 1Þ is chosen depending on how fast the physical components being tuned can change and how fast that change can be detected in the noisy cost functionĈðnÞ. When dealing with a large family of parameters whose values span many orders of magnitude, limits are defined for all parameters and then they are all normalized to in a range of AE1. The parameter updates are then carried out on the normalized values which are then unnormalized to physical set points for implementation in the accelerator. The experiments carried out at the LCLS and at the Eu-XFEL, for maximizing average photon pulse energy, as described in the following sections demonstrated the robustness of the algorithm to noise as the shot-to-shot pulse energy at both facilities sometimes varied by as much as 10% even when all parameter values were held constant. This is particularly clear in Fig. 5 where the parameters were all held constant during the first and last 200 pulse energy measurements and can be seen to jump within a ∼100 μJ which is a ∼25% variation at the low energy range of ∼300 μJ and ∼10% variation at the higher energy range of 1000 μJ.

III. EUROPEAN XFEL
The European XFEL is one of the newest and most advanced FELs in the world with a capability of running 27000 pulses per second to three separate undulators, with plans to expand to five undulators in the future. A simplified view of the EuXFEL is shown in Fig. 1. Successful achievement of SASE and subsequent maximization of the FEL output power is a lengthy, iterative process which requires tuning the entire machine from the injector up to the undulator parameters. At the EuXFEL, we demonstrated our technique on various combinations of accelerator components in order to develop tools that can aid the operators in achieving SASE setup. We have incorporated our algorithm into the OCELOT framework so that it is available as a tool for operators, having the freedom to choose any combination of components to tune based on the objective function of their choice (usually energy maximization), with an autotuning feature that is described in Sec. V.

A. Longitudinal phase space
The rf systems control the longitudinal phase of the beam [65]. The longitudinal dynamics depend on a highly coupled set of rf-parameters. For longitudinal phase space tuning we simultaneously tuned five parameters, p ¼ ðp 1 ; …; p 5 Þ which are shown in Fig. 1. 1. The Injector chirp (I1 chirp), which controls the compression and peak current of the beam in the bunch compressor BC0. 2. Injector curvature (I1 curvature), which helps produce symmetric beam profiles without spikes. 3. Injector third derivative (I1 third derivative), which does not influence the core of the beam distribution, but helps to optimize RF parameter sensitivity. 4. The Linac 1 chirp (L1 chirp), which controls the compression and peak current of the beam in the bunch compressor BC1. 5. The Linac 2 chirp (L2 chirp), which controls the compression and peak current of the beam in the bunch compressor BC2. In addition, it should be pointed out that the complexity of the problem is increased by the fact that all upstream parameters influence the downstream peak currents. The chirp affects the peak current, the curvature influences the shape of current profile and the third derivative has an influence on the spikiness of the beam.
The results of one rf tuning experiment are shown in Fig. 2, where a 10 point moving average of the average bunch energy was utilized as the cost function being maximized. rf parameters were adjusted at a rate of ∼1 Hz and the beam average bunch energy is seen to grow from ∼700 μJ to ∼1000 μJ over the course of 30 seconds. Although the SASE in the EuXFEL can reach much higher energy per bunch with additional tuning of various components, such a procedure was found to be useful during the initial SASE startup phase to find reasonable parameter settings. This is especially useful because the actual rf read-backs suffer from slow drift throughout the accelerator and therefore even once proper set points are established for the rf system, drift requires periodic phase scans to reestablish new proper set points. This is especially problematic at start up when the machine might initially be very far from the required optimal rf conditions. The method demonstrated here not only helps to initially find a good neighborhood of the rf settings, but can also be run all of the time to continuously readjust rf settings in order to track optimal conditions despite uncertain time variation of the machine. Furthermore, in some ways the FEL's output power is sensitive to and influenced by any variation in the rf system in terms of its influence on beam orbit. Although there are orbit feedbacks along the accelerator that keep the orbit as measured by beam position monitors (BPM) constant, there may be cases where even though feedbacks maintain a constant orbit (BPM measures the center of gravity of the bunch), the lasing slice may have a different orbit due to the CSR kick, and this slice offset may depend on rf settings. This couples the effect of the phase shifter gap settings, air coil settings, and the rf system on the average pulse energy.

B. Phase shifter gaps
The EuXFEL has multisegmented undulators consisting of 5 m long modules with phase shifters, orbit correctors, and quadrupoles in between. The phase shifters are permanent magnet systems in which the gap size between magnet poles controls the beam's path length, thereby adjusting the phase shift between the beam and the light which has been produced by the SASE process. We adjusted these gaps in order to maximize the constructive superposition between the beam and the emitted light in order to maximize average pulse energy. The phase shifter settings are sensitive to the beam orbit, which is influenced by many slowly drifting accelerator systems including the rf system. Hysteresis is one of the nonlinear effects that require additional tuning of the phase shifter gaps. The magnetic field can differ from the theoretical strength when the gaps are changed from open to closed. Larger beam deviations from the center trajectory in the undulator, especially in vertical plane can have an effect that can be corrected with phase shifters. On top, "wrong" k-values of the single undulator cells (e.g., due to "wrong" taper settings) can be compensated with the phase shifters as well. In the first two experiments we tuned 15 phase shifters demonstrating that we could increase average bunch energy despite relying on a very noisy signal and despite other drifts throughout the machine.
In Fig. 3 we show the evolution of 15 gap settings and the difference between initial and final gap positions following tuning. We left the gaps fixed for the first and final 100 steps of the process, to show the extremely high natural variance of the signal. In Figure 4  FIG. 5. As the machine slowly drifts, the algorithm continues to adjust 5 phase shifter gaps to maximize average bunch energy. the evolution of the noisy average bunch energy and its steady increase over the course of ∼16 minutes as this procedure was iteratively performed at a rate of ∼1 Hz utilizing a 5x moving average of the bunch energy signal. One can see that the pulse energy has many dramatic drops, which were due to fast machine transients. However, looking at the gap position evolution, we see that the gap adjustments did not suffer from any dramatic jumps, as expected, because this method is robust to random noise and because we have analytically guaranteed bounds on parameter update rates as well as hard constraints on all parameter settings.
In Fig. 5 we show a second experiment in which we more than doubled the average bunch energy and only relied on 2x averaging of the very noisy bunch energy signal. We left the gaps fixed for the first and final 100 steps of the process, to show the extremely high natural variance of the signal. Here we see the algorithms ability to respond to machine transients and dramatically readjust gaps to make up for rf/beam energy fluctuations during the tuning process.
Finally, in Fig. 6 we demonstrate the methods robustness as it begins to optimize output energy and then responds to a large step change in rf settings to continue to reoptimize the gap settings. We left the gaps fixed for the first 100 steps of the process, to show the extremely high natural variance of the signal.

C. Air coils
For each undulator section there are 4 air coils which control the beam orbit, directly before the section (CAX and CAY), in both the vertical (y) and horizontal planes (x) and directly after the section (CBX and CBY). The beam orbit is sensitive to rf system settings, launch, undulator and phase shifter gaps, and quadrupole offsets. Figure 7 shows the result on an experiment in which 84 air coils were tuned, 21 x and y before and after each undulator section. This experiment was performed when the SASE level was extremely low and was just initially being set up. The algorithms is seen to converge to a local maximum, after which subsequent tuning of other accelerator settings would take place before returning to the air coils again. Initial and final air coil settings are shown in Fig. 8. One interesting point to note in this experiment is that the beam shut off at around step 325 of optimization. As expected, the air coils did not experience

D. Air coils and gaps
As a final demonstration of the usefulness of this approach for finding and tuning the initial SASE setup, and its ability to handle multiple parameters simultaneously, we tuned all 84 air coils and 21 phase shifter gaps simultaneously with no averaging, using only single shot measurements. The result is shown in Fig. 9, a slow, steady increase in average pulse energy is seen.

IV. LCLS
At the LCLS we demonstrated the algorithm's ability to tune the longitudinal phase space of the beam by adjusting the rf phase and bunch compressor settings. A simplified view of the LCLS FEL is shown in Fig. 10, locations of parameters being tuned labeled in green.
For energy maximization we simultaneously tuned the following six parameters, p ¼ ðp 1 ; …; p 6 Þ: 1. The Linac 1 (L1S) phase set point has influence on both electron bunch energy and the length change due to bunch compressor BC1. L1S drifts continuously due to temperature and periodically has to be corrected via lengthy, invasive phase scans. 2. The Linac 1 X-band (L1X) linearizing cavity phase set point linearizes the electron bunch, compensating for energy offsets introduced by L1S. 3. The bunch compressor 1 (BC1) energy set point determines the amount of longitudinal compression of the bunch and provides feedback for the L1 amplitude set point. 4. The Linac 2 (L2) phase set point controls a group of multiple klystrons, effects bunch length, and suffers from the same drifts as L1. 5. Bunch compressor 2 (BC2) energy set point. BC2 is a second stage of compression at higher energy. 6. Linac 3 (L3) phase set point. L3 is another multiklystron system with phase and amplitude drifts.
To maximize the noisy/stochastic FEL output power, Cðp; tÞ, we tuned our parameters iteratively according to the finite difference approximation of (1): For each of our parameters, we chose upper and lower bounds, p i;max and p i;min and normalized our parameter values to within the bounds ½−1; 1. We then performed the ES update (6) on normalized parameter values, before unnormalizing and entering the physical parameter quantities into the accelerator control system. Normalization was useful in this case because the various parameters have order of magnitude differences in the ranges of their values. During a limited beam time we were able to conduct two experiments in which we optimized the FEL's output power.

A. Experiment 1
We began by gently tuning the parameters with small values of k ¼ 1 and α ¼ 1 in (6) with ω 0 ¼ 2000, fω i g ¼ fω 1 ;…;ω 6 g ¼ fω 0 ;…;1.75 × ω 0 g, and Δ ¼ 2π 20×maxfω i g . The values of ω are chosen to span the range ½ω 0 ; 1.75 × ω 0 so that each component evolves with an independent frequency, so that the frequencies are within a relatively close range of each other resulting in all parameters evolving with similar rates, and to avoid any one parameter evolving with an integer multiple frequency of another so that nonlinearities in the system do not cause overlap in parameter evolution in the frequency domain. Our cost function was a 40× averaged value, that is, we would set parameter values and record 40 consecutive FEL output pulses at a rate of 10 Hz to obtain a cleaner measurement of the output power. We would then perform one parameter update. The results of the first experiment are shown in Fig. 11. ES was able to tune all 6 parameters simultaneously, increasing the average power by 31%. The convergence was very slow because of the small controller gains and the large number of averages.

B. Experiment 2
In the second experiment we wanted to demonstrate was only a 2× averaged value of the output power, that is, we would set parameter values and record only 2 consecutive FEL output pulses at a rate of 10 Hz to obtain a cleaner measurement of the output power. We would then wait 1 second, to allow the physical parameters to settle to their new ES-based set points after performing each parameter update. The parameters were tuned more aggressively with values of k ¼ 10, α ¼ 20, and fω i g and Δ as before. The results of the second experiment are shown in Fig. 12. Again ES was able to tune all 6 parameters simultaneously, and this time at a much faster rate, increasing the average power by 20%. For this second experiment we started at a higher power and did not have sufficient time to study the faster scheme any further, whose optimization would have likely continued to higher power.

V. AUTOTUNE
An autotuned version of the algorithm has been developed as described here. The goal is to choose gains, k j , or dither frequencies, ω j , so that the parameters converge to their optimal values. The choices depend on the sensitivity of the cost function Cðp; tÞ relative to the parameters p j .
We do this according to the following calculations. For parameters being tuned according to dp j dt ¼ ffiffiffiffiffiffiffiffi αω j p cos½ω j t þ k j Cðp; tÞ we are interested in the rate of change of the angle θ j ðp; tÞ ¼ ω j t þ k j Cðp; tÞ, which is As discussed above, this algorithm results in average parameter dynamics of the form: Plugging (9) into (8), and assuming ∂C ∂t is slow relative to parameter dynamics, we get We then conjecture that the best choice of k j is one such that dθ j =dt ≈ 0, so that when the cost function is decreasing the parameter continues to move in the correct direction without changing direction. Therefore we want to solve: There are 3 possible ways to proceed: (1) Choose values for each k j first and then solve each of the Eq. (11) for ω j .
(2) Choose values for each ω j first and then numerically solve the entire system of equations (11) for all of the k j simultaneously. (3) Choose values for each ω j first and then further simplify by assuming that k i ≈ k j for all i and j in (11), from which we get and solving for k j , we get  Although it is not necessary to make the assumption k i ≈ k j , these calculations are all approximations and this simplifies the implementation. In all of the methods above, in order to calculate partial derivatives with respect to parameters one begins optimization by making small changes in each parameter p j one at a time and recording Δp j as well as ΔC at each step, to then get a numerical approximation ∂C ∂p i ≈ ΔC Δp j . Note that there are many problems and limitations with this autotuning approach: (1) If the system is noisy, then these single parameter changes for a derivative estimate must be repeated multiple times and averaged to get useful values, furthermore, in a very noisy case one must be careful to make large enough changes Δp j , so that a measurable difference is seen in the cost function, ΔC, above the noise floor, in order to estimate ∂C ∂p i ; (2) If there are many parameters being tuned this can be a very lengthy process; (3) This result will hold only locally and must be re-calculated as one moves through time and parameter space and the shape of the function Cðp; tÞ changes.
This feature was added to the OCELOT autotuning code in the form of utilizing Eq. (13) to determine initial k j values and this was applied in the EuXFEL. Despite its limitations, this method was found to be useful on several occasions, as shown in Fig. 13 in which the method was applied twice for automatically tuning 4 air coils, resulting in a doubling of average bunch energy over ∼14 minutes. The time required for pulse energy maximization is specific to any particular tuning operation and depends on things such as specific machine configuration, starting energy, and the choice of parameters being tuned. Despite this variation, a doubling of energy by tuning 4 parameters in a matter of 14 minutes, as demonstrated above, is relatively fast for an operation that, when carried out manually can take up to an hour, especially if 4 parameters are tuned one at a time by an operator.

VI. CONCLUSIONS
We have demonstrated a robust, model-independent method for tuning coupled particle accelerator components for maximizing extremely noisy energy measurements in time-varying, drifting free electron lasers, demonstrating the approach in-hardware on two FELs. We have developed a tool for an automatic autotune setup of the algorithm which was successfully demonstrated in the EuXFEL control room for tuning families of 5 parameters. The major strength of this approach is that it is model independent, robust to noise, and can tune many coupled parameters simultaneously. FIG. 13. Tuning 4 air coils to maximize average bunch energy. Left: An autotune was performed to calculate the k j , a zoom in on the initial 4 steps shows the parameters being changed one at a time during the autotune procedure. Optimization was run for 4 minutes. Right: A second autotune was then performed and optimization was run for an additional 10 minutes at which a local maximum was reached.
The next step in this work will be to make the autotuning procedure more robust to extremely noisy signals. It was found that when the measured cost function variance was very high and slowly moving, the auto-tune procedure had difficulty, partly due to an inability to distinguish between its effect on the system and that of random, large, slow drifts. Towards this goal we will have to conduct further analytical studies and perhaps couple this algorithm with machine learning tools to utilize some knowledge of machine parameter sensitivities and long term drift characteristics.

APPENDIX: TUNING ALGORITHM MATHEMATICAL BACKGROUND
We provide mathematical results from [23][24][25] which is applicable to time-varying, nonlinear systems, not affine in control. where f, x, u ∈ R n , Cðp; tÞ ∈ R is an analytically unknown function, andĈ is a noise corrupted measurement of C. We where ω i ¼ ω 0 r i with r i ≠ r j . Consider also the average system with component dynamics: where l i > 0 is a constant which depends on f i (see [25] for details). Then, for any given compact set K ⊂ R n , for any pð0Þ ∈ K, for any time interval ½t 0 ; t 1 , and for any δ > 0, there exists ω ⋆ , such that for all ω 0 > ω ⋆ , the trajectory pðtÞ of (A1)-(A4) and the trajectorypðtÞ of (A5) satisfy the bounds: sup t∈½t 0 ;t 1 kpðtÞ −pðtÞk < δ: Remark 1: Note that the form of (A5) guarantees that the trajectory xðtÞ will approach and track a local maximum of the analytically unknown function Cðp; tÞ, despite only having access to its noise-corrupted measurement C ¼ Cðp; tÞ þ nðtÞ, if kα > 0 is sufficiently large. For minimization, instead of maximization, we simply choose k < 0.
Corollary 1: Consider a system of parameters, p ¼ ðp 1 ; …; p n Þ, an analytically unknown cost function Cðp; tÞ, and its noisy measurementĈðp; tÞ ¼ Cðp; tÞ þ nðtÞ. If the parameters are tuned according to the dynamics: then, for large ω i , the parameter dynamics are, on average, given by: resulting in maximization of Cðp; tÞ despite only having access to noisy samples.