Noninvasively improving the orbit-response matrix while continuously correcting the orbit

Based on beam positions and corrector excitations continuously recorded by, for example, a closed-orbit feedback system, we describe an algorithm that continuously updates an estimate of the orbit-response matrix. The speed of convergence can be increased by adding very small perturbations, so-called dither, to the corrector excitations. Estimates for the rate of convergence and the asymptotically achievable accuracies are provided.


I. INTRODUCTION
The orbit-response matrix relates changes of the dipole corrector excitations to the orbit changes observed on the beam position monitor system. It is of paramount importance for maintaining stable beam positions in storage rings, which is typically accomplished by "slow" orbit correction systems [1][2][3][4] and "fast" feedback systems [5,6]. These systems either use a response matrix generated from a computer model of the accelerator or a measured matrix, found by varying one corrector at a time and observing the ensuing changes with the beam position monitor (BPM) system.
arXiv:2104.05300v2 [physics.acc-ph] 1 Jul 2021

Comparing the measured matrix with a matrix derived from a computer model, as discussed in [7][8][9], makes it possible to track down deficient hardware, such as incorrectly calibrated power supplies or scale errors on position monitors. Usually, the response matrix is measured in dedicated shifts, labeled "machine development," where the excitation of one corrector after the other is varied and the resulting changes of the positions on the orbit monitor system are recorded; this is commonly referred to as an "open-loop" measurement. In this report, we discuss an algorithm that complements the existing methods. It requires no dedicated beam time and slowly improves an estimate of the response matrix essentially for free by using information from the "closed-loop" orbit feedback system. The procedure, based on a recursive least-squares algorithm [10,11], is completely non-invasive and can run while operating the accelerator in production mode, producing luminosity in a collider or photons in a light source. It has the remarkable property that the error bars asymptotically approach zero as the estimated response matrix approaches the "real" response matrix. The algorithm is, however, slow, because it "learns from noise," but it might nevertheless prove useful to continuously improve the response matrix at times normally not accessible for machine improvement. This opens the possibility of tracking down very slow changes of hardware parameters when post-processing the response matrix with, for example, LOCO [8].
This report is organized as follows: in the next section we develop the algorithm, followed by Section III, where we introduce a simple model storage ring used to illustrate it.
In Section IV we introduce dithering as a way to speed up the algorithm, before we explore its convergence properties, both during the early stages in Section V, and in the asymptotic regime in Section VI. Before concluding, we address a number of technical issues and extensions to the algorithm in Section VII.

II. THE ALGORITHM
The response matrix B with matrix elements B^{ij} relates the change in excitation u^j of steering magnet j with 1 ≤ j ≤ m to a change of the beam position x^i with 1 ≤ i ≤ n on monitor i. Here the superscripts label the different monitors and correctors. We will use the notation from quantum mechanics, with ket states |·⟩ denoting column vectors and bra states ⟨·| denoting row vectors, which will prove convenient later on. We thus collectively denote the values of all n BPMs by |x⟩ and of the m correctors by |u⟩. Correcting the orbit then means adding a perturbation B|u⟩ to the orbit |x⟩ that minimizes the residual orbit after correction. The residual orbit is given by
  |x⟩ + B|u⟩ + |w⟩ ,   (1)
where |w⟩ describes noise in the system, for example, due to ground motion or BPM noise.
When correcting the orbit, we have to find corrector excitations |u⟩ that minimize the squared norm ⟨x|x⟩ of the residual orbit.
One problem is that we do not have complete knowledge of the system matrix B. All we know is a more or less accurate estimate B̂ that was previously derived from a computer model or from measurements, and we use that estimate when correcting the orbit.
Assuming that the position monitors report values |x⟩, and furthermore assuming that the desired orbit is centered around zero, allows us to calculate the desired corrector excitations |u⟩ by inverting B̂, the approximation of B from Equation 1. If B̂ is square (n = m) and invertible, this is just the matrix inverse −B̂⁻¹, where the minus sign ensures that the effect of the correctors cancels the observed orbit. If the system is over-determined (n > m), this is accomplished by the Moore-Penrose pseudo-inverse −(B̂ᵀB̂)⁻¹B̂ᵀ, which follows from minimizing the squared norm of |x⟩ + B̂|u⟩ with respect to |u⟩. If the system is under-determined (n < m), B̂ can be inverted using singular value decomposition. In general, we denote the linear dependence of the corrector excitations on the observed orbit |x⟩ by the "correction" matrix K, such that |u⟩ = −K|x⟩.
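The construction of K from B̂ amounts to a few lines of linear algebra. The following Python sketch (our own illustration with made-up matrix sizes, not part of the paper's Matlab code) uses numpy's SVD-based pseudo-inverse, which covers the over-determined case (n > m) discussed above:

```python
import numpy as np

def correction_matrix(B_hat: np.ndarray, rcond: float = 1e-10) -> np.ndarray:
    """Correction matrix K with u = -K x.

    For n >= m with full column rank this reduces to the Moore-Penrose
    pseudo-inverse (B^T B)^{-1} B^T; np.linalg.pinv computes it via the SVD,
    which also covers the under-determined case.
    """
    return np.linalg.pinv(B_hat, rcond=rcond)

rng = np.random.default_rng(1)
B_hat = rng.normal(size=(12, 10))        # n = 12 monitors, m = 10 correctors
K = correction_matrix(B_hat)

x = rng.normal(size=12)                  # observed orbit
u = -K @ x                               # corrector excitations
residual = x + B_hat @ u                 # orbit predicted after correction

# u minimizes the squared norm, so the residual is orthogonal to range(B_hat)
print(np.linalg.norm(B_hat.T @ residual))
```

The printed norm is numerically zero, which is the defining property of the least-squares solution.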
Our task is now to extract information from repeatedly correcting the orbit and correlating the orbit change with the corrector changes |u⟩ that were used. To this end we note that the noise |w⟩ and the mismatch between the "real" accelerator model B and B̂, from which K is derived, cause the correction to be imperfect. We model this dependence by the dynamical system
  |x_{t+1}⟩ = |x_t⟩ + B|u_t⟩ + |w_t⟩   with   |u_t⟩ = −K|x_t⟩ ,   (2)
where the subscript t denotes a discrete time step from one iteration of the orbit correction to the next. We assume that the noise |w_t⟩ is Gaussian and characterized by the expectation value E{|w_s⟩⟨w_t|} = σ_w² C δ_{st}. Here σ_w² C is the spatial covariance matrix, where σ_w is the rms magnitude and C describes correlations among different BPMs. In Appendix D we will return to the general case, but assume C to be an n × n unit matrix in the main text. Furthermore, δ_{st} is the Kronecker delta, which implies that we treat noise as uncorrelated from one iteration to the next. Note also that the effect of power supply noise |ṽ_t⟩ added to |u_t⟩ is equivalent to additional noise on the monitors with magnitude B|ṽ_t⟩. In Equation 2 we implicitly omit fast time-dependent effects, such as latency in the power supplies or the computation chain, as well as the effect of eddy currents. In Section VII we briefly discuss how to include these effects, but in the main text all transient effects are assumed to have settled to a new equilibrium from one iteration to the next. Now the interpretation of Equation 2 is straightforward: the system responds with the "real" response matrix B to a change of the corrector excitation by |u_t⟩ that was calculated with the approximate inverse K and the orbit |x_t⟩. At the same time, noise enters the system through |w_t⟩, such that the residual orbit |x_{t+1}⟩ after the correction is not necessarily equal to zero. Iterating the orbit correction, which is what orbit feedback systems essentially do, can now be modeled by iterating the system described by Equation 2.
In order to find an estimate B̂ of the system matrix B one row at a time, corresponding to a particular BPM i, we construct linear systems of equations for each time step and solve the resulting sequence of equations with a recursive least-squares algorithm [10,11]. To set up the equations, for the time being, we ignore the noise |w_t⟩ and formulate Equation 2 for this BPM as a constraint for B̂. Writing the constraints over consecutive readings x^i_s, we find
  x^i_{s+1} − x^i_s = Σ_j B̂^{ij} u^j_s = ⟨u_s|B̂^i⟩ ,   (3)
where the second equality follows from exchanging the order of writing the scalar product of |B̂^i⟩, the transposed row i of B̂, and the corrector excitations u^j_s. In the next step we assemble multiple copies of this equation from different times 1 ≤ s ≤ T in the form of a matrix equation
  (x^i_2 − x^i_1, . . . , x^i_{T+1} − x^i_T)ᵀ = U_T |B̂^i⟩ ,   (4)
and denote the matrix containing the corrector excitations u^j_s by U_T, which thus contains the excitations of all correctors, stacked row by row on top of each other. Likewise, the vector on the left-hand side contains the orbit differences that the successive steering magnet excitations cause. If we now record BPM positions and corresponding corrector excitations for a long time T, the system of equations in Equation 4 is vastly over-determined, provided that the noise really affects all possible degrees of freedom of the system, which implies that in general the covariance matrix C must have full rank. Since we assume C to be the unit matrix, this is the case and we can solve Equation 4 in the least-squares sense with the pseudo-inverse mentioned above. We find
  |B̂^i_T⟩ = (U_Tᵀ U_T)⁻¹ U_Tᵀ (x^i_2 − x^i_1, . . . , x^i_{T+1} − x^i_T)ᵀ ,   (5)
which provides an estimate for row i of the matrix, B̂^i_T, after T iterations of the orbit correction. Repeating this procedure for all BPMs provides us with an estimate for the complete system matrix B̂_T.
In passing, we point out that Equation 5 describes a linear map from the vector with the position differences on the right-hand side onto the vector with row i of B̂_T, which allows us to calculate an empirical (data-driven) covariance matrix of B̂_T from the covariance matrix of the position differences, which is 2σ_w² times the unit matrix. The error bars σ(B̂) of the fitted B̂_T are therefore approximately given by the square roots of the diagonal elements of 2σ_w² (U_Tᵀ U_T)⁻¹, which will prove useful later on.
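This error estimate can be checked with a small Monte Carlo experiment. The following Python sketch (our own, with made-up numbers) repeatedly fits one row of the response matrix from noisy position differences and compares the scatter of the fitted coefficients with the predicted error bars:

```python
import numpy as np

rng = np.random.default_rng(0)
m, T, sigma_w = 4, 200, 0.1
b_true = rng.normal(size=m)              # one row of B, treated as fit vector
U = rng.normal(size=(T, m))              # corrector excitations, one per step

# predicted error bars: sqrt of the diagonal of 2 sigma_w^2 (U^T U)^{-1}
predicted = np.sqrt(np.diag(2 * sigma_w**2 * np.linalg.inv(U.T @ U)))

# Monte Carlo: the position differences carry noise of variance 2 sigma_w^2
trials = 2000
fits = np.empty((trials, m))
for k in range(trials):
    y = U @ b_true + np.sqrt(2) * sigma_w * rng.normal(size=T)
    fits[k] = np.linalg.lstsq(U, y, rcond=None)[0]

empirical = fits.std(axis=0)
print(predicted, empirical)   # the two agree within a few percent
```

The agreement between the two vectors is the data-driven covariance statement made above.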
Calculating the pseudo-inverse of U_T for more and more iterations becomes numerically very expensive. There is, however, an elegant way of iteratively updating the pseudo-inverse using the Sherman-Morrison formula [12]. It is based on updating P_T = (U_Tᵀ U_T)⁻¹ and B̂_T as the matrix U_T grows one row at a time by adding the row vector ⟨u_{T+1}| = (u¹_{T+1}, . . . , u^m_{T+1}) to it. This entails that we can write P⁻¹_{T+1} = P⁻¹_T + |u_{T+1}⟩⟨u_{T+1}|. In Appendix A we show that its inverse is given by
  P_{T+1} = P_T − P_T |u_{T+1}⟩⟨u_{T+1}| P_T / (1 + ⟨u_{T+1}|P_T|u_{T+1}⟩) .   (6)
With P_{T+1} = (U_{T+1}ᵀ U_{T+1})⁻¹ known, we can calculate an updated approximation [13] of the response matrix B̂_{T+1} from
  B̂_{T+1} = B̂_T + ( |x_{T+2}⟩ − |x_{T+1}⟩ − B̂_T|u_{T+1}⟩ ) ⟨u_{T+1}| P_{T+1} .   (7)
We refer to Appendix B and [13] for the derivation. Note that in Equations 6 and 7 the right-hand sides only depend on P_T and B̂_T from the previous iteration, the new corrector excitations |u_{T+1}⟩, and the most recent change in the orbit |x_{T+2}⟩ − |x_{T+1}⟩. These two equations now allow us to continuously update the response matrix while correcting the orbit with K. All we do here is correlate the change of the orbit |x_{T+2}⟩ − |x_{T+1}⟩ with the corrector excitations |u_{T+1}⟩ that caused this change and update our approximation of B in the process.
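In a numerical implementation, one iteration of Equations 6 and 7 amounts to only a handful of matrix-vector products. A minimal Python sketch (function and variable names are ours, not the paper's):

```python
import numpy as np

def rls_step(P, B_hat, u, dx):
    """One recursive least-squares update of Equations (6) and (7).

    P     : current estimate of (U_T^T U_T)^{-1}, shape (m, m)
    B_hat : current response-matrix estimate, shape (n, m)
    u     : corrector excitations applied in this step, shape (m,)
    dx    : resulting orbit change |x_{T+2}> - |x_{T+1}>, shape (n,)
    """
    Pu = P @ u
    P_new = P - np.outer(Pu, Pu) / (1.0 + u @ Pu)        # Eq. (6)
    B_new = B_hat + np.outer(dx - B_hat @ u, P_new @ u)  # Eq. (7)
    return P_new, B_new

# noiseless sanity check: feeding exact orbit changes B u drives B_hat to B
rng = np.random.default_rng(7)
n, m = 4, 3
B = rng.normal(size=(n, m))
B_hat = B + 0.3 * rng.normal(size=(n, m))
P = np.eye(m)
err0 = np.linalg.norm(B_hat - B)
for _ in range(200):
    u = rng.normal(size=m)
    P, B_hat = rls_step(P, B_hat, u, B @ u)
print(np.linalg.norm(B_hat - B) / err0)   # much smaller than one
```

With noiseless data the estimation error shrinks by roughly the number of accumulated observations, in line with the least-squares interpretation of Equation 5.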
With the basic algorithm worked out, we simulate its performance in the following section.

III. SIMULATION
In order to test the algorithm we prepared response matrices for a small ring consisting of ten FODO cells, each having phase advances of µ_x/2π = 0.228 and µ_y/2π = 0.238 in the horizontal and vertical plane, respectively. The tunes of the ring therefore are Q_x = 2.28 and Q_y = 2.38. Moreover, there are two 18-degree sector dipole magnets in each cell. The beta functions of one cell are shown on the left-hand side in Figure 1. We place a corrector and a BPM at the same location as the (thin-lens) focusing quadrupole, which then accounts for ten correctors and ten BPMs. In order to keep the simulation transparent, we calculate the response matrix B between these correctors and BPMs in the horizontal plane only. Most of the simulations are done for equal numbers of correctors and monitors; we address other cases in Section VII. The response matrix derived from this unperturbed ring is the "ideal" response matrix B̂ that we use to derive the correction matrix K = (B̂ᵀB̂)⁻¹B̂ᵀ to correct the orbit. In order to simplify the theoretical analysis in Section VI, in the remainder of this report we confine ourselves to the constant correction matrix derived from this fixed B̂. If, instead, we were to use the continuously updated B̂_T for the correction, the algorithm would be adaptive. In order to determine the "real" response matrix B, we randomly vary the focal lengths of the quadrupoles with an rms of 5 % and re-calculate the response matrix B for the perturbed ring. We note that the rms magnitude of the response coefficients is 6.6 m/rad. In order to quantify the estimation error b_T = B̂_T − B after T iterations, we introduce the rms value of b_T, calculated over all matrix elements, as the discrepancy |b_T|_rms. It can also be calculated from
  |b_T|_rms = ( Trace(b_Tᵀ b_T) / nm )^{1/2} .
Evaluating the initial value |b_0|_rms for our model storage ring, we find that it is approximately 0.3 m/rad, which corresponds to a 5 % rms deviation of the response matrix coefficients.
The simulations are based on Matlab scripts that use beam optics functions from [14]. The code illustrating one iteration of the algorithm is reproduced and commented in Appendix C.
Running the simulation for 10^5 iterations, which takes a few seconds on a desktop computer, produces Figure 2, which shows the evolution of the rms orbit σ_x (with σ_x² = ⟨x|x⟩/n) and the discrepancy |b_T|_rms between the "real" and the estimated response matrix. We initialized the estimate B̂_0 with the response matrix for the ring without quadrupole gradient errors and P_0 with the unit matrix. The rms amplitude of the noise σ_w was chosen to be 0.1 mm. The upper plot, which shows the rms orbit σ_x for the duration of the simulation, clearly reflects this choice; the mean is close to 0.1 mm. At the same time, the discrepancy |b_T|_rms, shown on the lower plot, is approximately halved to a final value |b_f|_rms = 0.168 m/rad, which shows that the algorithm works.
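The structure of the simulation loop is easy to reproduce. The following Python sketch is our own stand-in for the paper's Matlab scripts: it replaces the FODO optics by a made-up, well-conditioned response matrix and uses arbitrary units, but runs the same closed-loop correction and recursive update and shows the discrepancy shrinking:

```python
import numpy as np

rng = np.random.default_rng(42)
n = m = 6
sigma_w = 0.1                                  # BPM noise, arbitrary units

B = 3.0 * np.eye(n) + 0.5 * rng.normal(size=(n, m))  # made-up "real" matrix
B_hat = B * (1 + 0.1 * rng.normal(size=(n, m)))      # perturbed starting guess
K = np.linalg.pinv(B_hat)                      # fixed correction matrix

P = np.eye(m)
x = np.zeros(n)
b0 = np.sqrt(np.mean((B_hat - B) ** 2))        # initial discrepancy |b_0|_rms

for _ in range(30000):
    u = -K @ x                                 # orbit correction
    x_new = x + B @ u + sigma_w * rng.normal(size=n)  # Eq. (2) plus noise
    Pu = P @ u                                 # recursive update, Eqs. (6), (7)
    P -= np.outer(Pu, Pu) / (1.0 + u @ Pu)
    B_hat += np.outer(x_new - x - B_hat @ u, P @ u)
    x = x_new

bf = np.sqrt(np.mean((B_hat - B) ** 2))        # final discrepancy |b_f|_rms
print(b0, bf)   # the final discrepancy is clearly below the initial one
```

The loop "learns from noise" exactly as described above: the only excitation of the correctors comes from the feedback reacting to the injected noise.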
Repeating the same simulation (always for 10^5 iterations) for different values of σ_w and recording the final discrepancy |b_f|_rms produces the plot shown in Figure 3. Here we find that increasing noise levels are beneficial for the rate of convergence, up to about σ_w = 0.5 mm, where the induced changes in b during one iteration become comparable to the magnitude of b. We stress that the plotted values are those reached after 10^5 iterations; they are not the asymptotic levels.
The algorithm is rather stable. We ran simulations where we initialized B̂_0 with random matrices or other made-up starting guesses. The algorithm, after an initial transient period, always converged towards the "real" matrix B.
We point out that the convergence depends on the noise level: more noise moves the correctors around more and actually improves the convergence, but the rate is still rather slow, on the order of several 10^5 iterations, which would correspond to about three hours of real time, provided that the feedback operates at an update rate of 10 per second. Moreover, the asymptotically achievable discrepancy is of considerable interest. We will address these topics below, after having introduced the effect of additional corrector perturbations.

IV. DITHERING
Varying the corrector excitations one at a time, either systematically or sinusoidally [15][16][17], in order to determine the response matrix is used in practically all accelerators. Moreover, continuously varying the correctors so little that the detrimental effect on the orbit is negligible, so-called dithering, was successfully used [18][19][20] to optimize the performance of a number of accelerators. We implement dithering in our simulation by adding a perturbing vector |z_t⟩ to |u_t⟩ when correcting the orbit in Equation 2, which therefore becomes
  |x_{t+1}⟩ = |x_t⟩ + B( |u_t⟩ + |z_t⟩ ) + |w_t⟩ .
The rest of the simulation remains unaffected; any changes of |z_t⟩, and consequently of |u_t⟩, are consistently accounted for in the updates of P_T and B̂_T in Equations 6 and 7.
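For round-robin dithering, the perturbation vector simply cycles through the correctors. A short Python illustration (our own, with made-up numbers) also verifies that over m consecutive iterations the dither outer products sum to the squared amplitude times the unit matrix, i.e. z²/m per iteration on average, which is how the dither enters the analysis in Section V:

```python
import numpy as np

def round_robin_dither(t: int, m: int, amplitude: float) -> np.ndarray:
    """Dither vector |z_t>: excite corrector number t mod m."""
    z = np.zeros(m)
    z[t % m] = amplitude
    return z

# over m consecutive iterations the outer products |z_t><z_t| sum to
# amplitude^2 times the unit matrix, i.e. z^2/m per iteration on average
m, amp = 10, 20e-6                # e.g. 20 urad cycled over 10 correctors
S = sum(np.outer(round_robin_dither(t, m, amp),
                 round_robin_dither(t, m, amp)) for t in range(m))
print(np.allclose(S, amp**2 * np.eye(m)))   # prints True
```
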
In the simulation shown on the left-hand side in Figure 4, round-robin dithering with an amplitude of 20 µrad was turned on between 20000 and 40000 iterations. We clearly see that the rms orbit increases from 0.10 to 0.16 mm during this period, which is consistent with expectations, because the rms value of the response coefficients of 6.6 m/rad and 20 µrad of additional excitation result in an additional rms orbit variation of 0.13 mm, which, added in quadrature to σ_w = 0.1 mm, gives about 0.16 mm. We also observe on the lower plot that the discrepancy |b_T|_rms is significantly reduced and conclude that temporarily adding dithering helps to improve our knowledge of the response matrix. Note that no additional processing of the data is necessary. The algorithm learns whenever it gets the chance to observe some variation, never mind the source of the perturbation. Remarkably, even a slammed door might be beneficial for something.
In the simulation shown on the right-hand side in Figure 4, we keep the 20 µrad round-robin dithering on permanently and observe that the rms orbit is 0.16 mm throughout the simulation, while the discrepancy |b_T|_rms is reduced sevenfold. Again, no special processing is required.
The left-hand plot in Figure 5 illustrates the effect of the dither amplitude, shown on the horizontal axis, on the rms orbit (dashed black) and on the discrepancy (solid red). We clearly observe that increasing the dither amplitude increases the rms orbit σ_x but, at the same time, helps to reduce the discrepancy |b_f|_rms. Closer inspection shows that a dither amplitude of 16 µrad contributes to σ_x with the same magnitude as the noise level σ_w.
This causes σ_x to increase by 40 %. At the same time, |b_f|_rms is reduced to a third, from 0.168 m/rad to 0.056 m/rad. This configuration is indicated by the vertical dotted line in Figure 5.
The right-hand plot in Figure 5 shows the data from the left-hand plot, but now plotting the discrepancy |b_f|_rms versus the rms orbit σ_x (solid red), and compares it to the data from Figure 3 (dashed black). Unsurprisingly, increasing σ_x by dithering reduces |b_f|_rms more efficiently than just increasing the natural noise level σ_w.

V. CONVERGENCE
A matter of practical interest are the time scales, given by the number of iterations, before we observe some improvement of the response matrix. We point out that the results developed in the following sections apply to all systems described by Equation 2, which includes rings with transverse coupling and correction matrices K that use elaborate regularization schemes. The simulations, which are based on correctors and monitors in a single transverse plane, are only used to illustrate the general results. Let us start by analyzing the initial behavior of the discrepancy and approximate Equation 6 by replacing |u_{T+1}⟩⟨u_{T+1}| by its expectation value E{|u_{T+1}⟩⟨u_{T+1}|}, which asymptotically becomes independent of T. We therefore use E{|u_T⟩⟨u_T|} instead, which depends on |x_T⟩ via |u_T⟩ = −K|x_T⟩, and calculate
  E{|x_T⟩⟨x_T|} = Λ E{|x_{T−1}⟩⟨x_{T−1}|} Λᵀ + σ_w² 1 = Λ^T |x_0⟩⟨x_0| (Λᵀ)^T + σ_w² Σ_{s=0}^{T−1} Λ^s (Λᵀ)^s ,
where the second equality results from iterating the first equality. Since the spectral radius ρ(Λ) with Λ = 1 − BK is much less than unity, the influence of the initial |x_0⟩ "dies out" for large T and we can omit the first term.
Multiplying with K from the left and Kᵀ from the right, we obtain
  E{|u_T⟩⟨u_T|} = K E{|x_T⟩⟨x_T|} Kᵀ = σ_w² Σ_{s=0}^{T−1} K Λ^s (Λᵀ)^s Kᵀ + o(1) ,   (10)
where we used that the expectation value of the Gaussian noise is E{|w_s⟩⟨w_t|} = σ_w² δ_{st} 1. Moreover, o(1) denotes a quantity that vanishes in the limit of large T. The smallness of ρ(Λ) implies that only the term with s = 0 in the sum in Equation 10 contributes, and we obtain E{|u_T⟩⟨u_T|} ≈ σ_w² KKᵀ. We include round-robin dithering with amplitude z through the m correctors by adding a term (z²/m) 1, because the dithering is uncorrelated with the noise and after m iterations the dithering contributes a unit matrix; we thus just "spread out" this unit matrix over the individual iterations by dividing by m. We therefore introduce
  Q = σ_w² KKᵀ + (z²/m) 1   (11)
to represent the average effect of the orbit correction and dithering when updating the "averaged" P̄_T in Equation 6, which then reads
  P̄_{T+1} = P̄_T − P̄_T Q P̄_T / (1 + Trace(Q P̄_T)) .   (12)
Note that Equation 12 is a deterministic equation that describes the averaged updating of P̄_T. In the simulation we update P̄_T in parallel to its stochastic brethren P_T and find that they are extremely close, both with and without dithering. The upper panel in Figure 6 shows an example with σ_w = 0.1 mm and z = 20 µrad, which corresponds to the configuration also displayed on the right-hand side in Figure 4. The solid black curve is produced by a numerical simulation with simulated random noise and the dashed red curve shows the result of the deterministic iteration, based on Equations 11 and 12.
We point out that Q is the only parameter in the dynamics described by Equation 12. In order to simplify the analysis somewhat, we neglect the trace in the denominator, which is practically always much smaller than 1, such that 1 + Trace(Q P̄_T) ≈ 1, which allows us to write the equation as P̄_{T+1} = P̄_T − P̄_T Q P̄_T. Moreover, Q is symmetric by construction and we can choose a coordinate system in which Q is diagonal with eigenvalues λ_j = σ_w² σ_j + z²/m, where the σ_j are the eigenvalues of KKᵀ, such that Q = ODOᵀ with D = diag(λ_1, . . . , λ_m) and an orthogonal matrix O. The starting guess P̄_0 is the unit matrix and therefore also diagonal, such that Equation 12 can be written as m independent equations for the m diagonal elements x_{j,T} of P̄_T. Each eigenvalue thus corresponds to one mode that describes the dynamics of the convergence process. In the following, we consider one mode at a time and omit the second index j = 1, . . . , m from x and λ to make the equations easier to read.
We therefore obtain x_{T+1} = x_T − λx_T², or its continuous approximation dx_T/dT = −λx_T², for each mode. This equation has the solution
  x_T = x_0 / (1 + x_0 λ T) .   (13)
Numerically, x_0 has the value of unity, because P̄_0 is the unit matrix, but we leave it in place to keep track of the units of x_0, which are 1/mrad². We thus find that the inverse eigenvalues 1/λ of the matrix Q determine the time scales of the convergence of the process. Note, however, that the time dependence is inversely proportional to T, rather than exponential, and is therefore slow. It remains to analyze the time scales of the convergence of the values of B̂_T to B, which is described by Equation 7. We note that |x_{T+2}⟩ − |x_{T+1}⟩ = B|u_{T+1}⟩ was caused by the corrector values |u_{T+1}⟩, such that we arrive at
  b_{T+1} = b_T ( 1 − |u_{T+1}⟩⟨u_{T+1}| P_{T+1} ) ,
where we subtracted B on both sides. We now replace |u_{T+1}⟩⟨u_{T+1}| by its expectation value and therefore use Q from Equation 11 to arrive at
  b̄_{T+1} = b̄_T Ξ_T   with   Ξ_T = 1 − Q P̄_T / (1 + Trace(Q P̄_T)) ,   (15)
where we introduced the deterministic counterpart b̄_T of b_T = B̂_T − B to simplify the writing. Like Equation 12 before, this is a deterministic equation for b̄_T that we update in parallel to the stochastic simulations that generate b_T. On the bottom panel in Figure 6 we show |b_T|_rms, the rms value of b_T, as a solid black line and |b̄_T|_rms as a dashed red line for a simulation with parameters specified in the figure caption. We take notice that the black and red curves track one another very well, which allows us to determine the time scales from analyzing Ξ_T from Equation 15. As before, we use a coordinate system in which Q and P̄_T are diagonal, ignore the denominator with the trace, and analyze one mode at a time. If we denote the eigenvalue of Ξ_T by ξ_T (and omit the mode index j henceforth, because we consider one mode at a time and want to use the subscript to denote the iteration), we find
  ξ_T = 1 − λ x_T = (1 + x_0 λ (T − 1)) / (1 + x_0 λ T) ,
where we substituted x_T from Equation 13. Again the time scales are determined by 1/λ, the inverse eigenvalues of Q.
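The quality of the continuous approximation is easy to check numerically. The following Python sketch (our own illustration, with an arbitrary eigenvalue λ) iterates the discrete recursion for one mode and compares it with the closed-form solution of Equation 13:

```python
import numpy as np

lam, x0, T = 1e-3, 1.0, 1000

# discrete recursion for one mode of the averaged P
x = x0
xs = [x]
for _ in range(T):
    x = x - lam * x * x
    xs.append(x)

closed = x0 / (1 + x0 * lam * np.arange(T + 1))   # Eq. (13)
err = np.max(np.abs(np.array(xs) - closed) / closed)
print(err)   # small: the closed form tracks the recursion
```

For small λ the two agree to well below a percent, confirming that the continuous approximation is adequate for the slow dynamics considered here.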
Inspecting Equation 15, we see that the eigenvalues ξ_s of Ξ_s describe how the modes decrease from one iteration to the next. In order to find the total reduction after T iterations we need to multiply all the previous eigenvalues ξ_s for 1 ≤ s ≤ T, which gives us the eigenvalues y_T of the product Y_T = Π_{s=1}^{T} Ξ_s, namely
  y_T = Π_{s=1}^{T} ξ_s = 1 / (1 + x_0 λ T) ,
where the last equality is straightforward to prove by induction. Thus Y_T is a diagonal matrix with expressions 1/(1 + x_0 λ_j T) along its diagonal. If we now rewrite this equation in non-diagonal coordinates, we obtain the matrix G_T = O Y_T Oᵀ, with λ_j = σ_w² σ_j(KKᵀ) + z²/m, that maps the initial b̄_0 to b̄_T without iterating through all the intermediate steps. In passing, we point out that G_T behaves like a transfer function that maps the initial b̄_0 to a later value b̄_T. Iterating with, for example, different dither amplitudes z only involves multiplying with a different G_T, each one calculated with the appropriate z. The left-hand plot in Figure 8 shows the stochastic simulation and |b̄_T|_rms from the deterministic iteration, while the blue dots are calculated with the matrix G_T. We observe that all three curves track one another very well. The plot on the right-hand side in Figure 8 shows the configuration with 20 µrad dithering added, already used in Figure 6, with the blue dots from the analytic calculation superimposed. Again, the agreement is rather good, though some discrepancies show up once |b_T|_rms becomes very small. Let us therefore analyze this late regime more carefully.
From the discussion in Section II we know that 2σ_w² P_T = 2σ_w² (U_Tᵀ U_T)⁻¹ is a data-driven approximation of the covariance matrix for the matrix elements of B̂_T. We therefore heuristically approximate the error bars by σ(B̂) = (2|P_T|_rms)^{1/2} σ_w and show |b_T|_rms for a numerical simulation (solid black) and the deterministic average (red dashes), as well as σ(B̂) (blue dash-dots), in Figure 9. We observe that once |b_T|_rms becomes smaller than σ(B̂), the remaining statistical fluctuations become the dominating factor for the rate of convergence. We therefore need to address the asymptotic regime separately, which is the topic of the next section.

VI. ASYMPTOTICS
The asymptotic regime is characterized by the discrepancy |b_T|_rms being smaller than the error bars or, heuristically, by the signal |b_T|_rms lying inside the noise floor. We saw in the simulations shown in the figures that even in this regime B̂_T converges towards the "real" response matrix B. If we focus on cases without dithering (z = 0), we can explore this further by exploiting a theorem by Lai and Wei [21], which states that
  |b_T|²_∞ = O( log( σ_max[ Σ_{t=1}^{T} |u_t⟩⟨u_t| ] ) / σ_min[ Σ_{t=1}^{T} |u_t⟩⟨u_t| ] ) ,   (19)
where σ_min[·] and σ_max[·] denote the smallest and largest eigenvalue of the matrix in the argument, respectively, and |·|_∞ denotes the largest matrix element of its argument, which is always larger than the rms value of all matrix elements that we used in the previous sections; the two values only differ by a numerical factor of order unity. The symbol O(·) denotes the leading order in T, and P⁻¹_T = P⁻¹_0 + Σ_{t=1}^{T} |u_t⟩⟨u_t| was introduced earlier. We therefore need to determine the scaling of Σ_{t=1}^{T} |u_t⟩⟨u_t| and its smallest and largest eigenvalues with T.
To do so, we note that the system defined by Equation 2 can be written as |x_{t+1}⟩ = (1 − BK)|x_t⟩ + |w_t⟩, which shows that the time step t + 1 only depends on parameters at time t, which makes it a Markov chain. Moreover, if the closed-loop system is stable, the spectral radius ρ(Λ), with Λ = 1 − BK, is strictly less than unity, which causes the process to forget all uniformly bounded initial conditions sufficiently fast. This makes the corresponding Markov chain uniformly ergodic and implies that the time average and the average over the distribution function of the noise, the expectation value E{·}, are the same,
  (1/T) Σ_{t=1}^{T} |u_t⟩⟨u_t| = E{|u_t⟩⟨u_t|} + o(1) ,   (20)
where, as before, o(1) is an expression that vanishes in the limit of large T. The right-hand side of Equation 20 we already calculated in Equation 10, and we now turn to its asymptotic behavior, which is encapsulated in the limit of Γ_T = Σ_{s=0}^{T−1} Λ^s (Λᵀ)^s for large T. First we note that this limit is finite, because ρ(Λ) < 1 makes the sum converge. Second, its existence can be proven by noting that Γ_T is a Cauchy sequence; Γ_T − Γ_{T'} = o(1) for large T, T'. We can therefore introduce Γ = lim_{T→∞} Γ_T and obtain
  (1/T) Σ_{t=1}^{T} |u_t⟩⟨u_t| = σ_w² K Γ Kᵀ + o(1) .   (22)
This expression allows us to determine the smallest eigenvalue of the left-hand side,
  σ_min[ Σ_{t=1}^{T} |u_t⟩⟨u_t| ] = T σ_w² σ_min[KΓKᵀ] + o(T) ,
and likewise for σ_max. We note that the smallest and largest eigenvalues of a matrix are continuous functions of the matrix elements. This implies, as a consequence of the continuous mapping theorem [22], that limits of these functions are preserved, even if the matrix elements depend on random variables. For the asymptotic approach of the estimate B̂_T towards the "real" response matrix B we insert the eigenvalues into Equation 19 and find
  |b_T|²_∞ = O( log( T σ_w² σ_max[KΓKᵀ] ) / ( T σ_w² σ_min[KΓKᵀ] ) ) ,   (25)
where we did not spell out constant factors. In passing we note that P_T = (U_Tᵀ U_T)⁻¹ scales with 1/σ_min[P⁻¹_T], and this leads to
  P_T = O( 1 / ( T σ_w² σ_min[KΓKᵀ] ) ) ,   (26)
which decreases like 1/T in the leading order.
In order to verify the asymptotics numerically, we run simulations with σ_w = 0.1 mm for 5 × 10^6 iterations; Figure 10 shows the evolution of Trace(P_T P_Tᵀ) and of the discrepancy for one such run, which still exhibits considerable run-to-run variability. In order to explore this variability we run the simulation with 400 different random seeds, all having σ_w = 0.1 mm, and plot the final value of the discrepancy |b_f|_rms, the slope of Trace(P_T P_Tᵀ), and the slope of |b_T|_rms in the top row of histograms in Figure 11. We see that after 5 × 10^6 iterations |b_f|_rms has reached a value of about 12 mm/rad (left). The slope of Trace(P_T P_Tᵀ) is −1.92 (center) and has not quite reached its asymptotic value of −2.
The asymptotic slope of |b_T|_rms (right) is approximately −0.75. The widths of the histograms indicate the standard deviations, which are quoted as the uncertainties in the respective legends of the plots. We observe that the results are reasonably stable and give a good indication of the asymptotic behavior of the system. In the bottom row in Figure 11 we show the corresponding plots for the situation where 20 µrad round-robin dither is added.
We find that the final value of |b_f|_rms is only 4 mm/rad (left), while the slope of Trace(P_T P_Tᵀ) is very close to the asymptotic value of −2. The slope of the discrepancy |b_T|_rms (right) indicates a value of approximately −0.53. We point out that the widths of the two histograms on the right are much larger than the others, which we again attribute to the logarithm in the numerator of Equation 25.

VII. SOME TECHNICAL ASPECTS
We now turn to practical aspects of our system to determine the "real" response matrix B. From Equations 25 and 26 we see that the most important quantity for the convergence is the smallest eigenvalue of KΓKᵀ, where Γ = Σ_{s=0}^{∞} Λ^s (Λᵀ)^s is defined immediately before Equation 22. For all well-behaved feedback systems ρ(Λ) = ρ(1 − BK) is much smaller than unity and the term with s = 0 dominates the sum, which makes Γ very close to the unit matrix. Since we do not a priori know B, we just set Γ to the unit matrix when evaluating the performance of our system and consider KKᵀ alone.
If the feedback system is equipped with more correctors than position monitors (n < m), the matrix KKᵀ is degenerate and has null eigenvalues, which spoils the convergence. The left-hand plot in Figure 12 shows what happens when we remove one row, corresponding to one position monitor, from the response matrix and repeat the analysis. The orbit, shown on the upper panel, is still corrected to an rms value comparable to σ_w, but |b_T|_rms, shown on the lower panel, no longer converges to zero. The identification of the response matrix only works partially and a finite difference to the "real" B remains.
If, on the other hand, there are more position monitors than corrector magnets (n > m; in the simulation we removed one column, corresponding to one corrector magnet, from the response matrix), the identification of the response matrix works well, as illustrated on the lower panel of the right-hand plot in Figure 12, because the m × m matrix KKᵀ has full rank and no null eigenvalues. On the other hand, we can no longer fully correct the orbit, as shown on the upper panel, because now the n × n matrix KᵀK has null eigenvalues. We can, however, remedy this problem by decomposing the symmetric n × n matrix KᵀK = ODOᵀ, where D is a diagonal matrix containing the eigenvalues d_i and O is an orthogonal matrix whose columns are the corresponding eigenvectors |o_i⟩. We note that Φ = Σ_{i in null space} |o_i⟩⟨o_i| is a projection matrix onto the null space of KᵀK, such that Ψ = 1 − Φ projects onto its orthogonal complement, which is the space of BPM readings that the correctors can actually affect. If we use Ψ|x⟩ instead of |x⟩ when we apply the correction, the null modes never pile up and become unstable. If we apply this method to the example from the right-hand side in Figure 12, the orbit in the upper panel looks very similar to the one on the left-hand plot. Since we always know K (as opposed to B, which we do not know), we can always construct Ψ. Using the projector Ψ we can thus use our algorithm also if there are more BPMs than correctors.
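The construction of the projector Ψ takes only a few lines. The following Python sketch (our own, with made-up matrix sizes) builds Φ and Ψ from the eigendecomposition of KᵀK and checks the projector properties:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 12, 8                       # more BPMs than correctors
K = rng.normal(size=(m, n))        # correction matrix, u = -K x

# eigendecomposition of the symmetric n x n matrix K^T K
evals, O = np.linalg.eigh(K.T @ K)
null = evals < 1e-10 * evals.max() # null-space modes of K^T K
Phi = O[:, null] @ O[:, null].T    # projector onto the null space
Psi = np.eye(n) - Phi              # projector onto its orthogonal complement

x = rng.normal(size=n)
# the correction only ever sees Psi x; the null-space part is discarded
print(np.allclose(K @ Psi @ x, K @ x))   # K is blind to the Phi part anyway
print(np.allclose(Psi @ Psi, Psi))       # Psi is idempotent, as a projector
```

For a generic full-rank K the null space has dimension n − m, so here Φ projects onto a four-dimensional subspace of BPM readings.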
For one-to-one orbit correction feedback systems with equal numbers of position monitors and correctors (n = m) we just have to evaluate the eigenvalues of KKᵀ and possibly adjust K by hand in order to speed up the convergence, albeit at the expense of compromising the orbit correction to some extent. The details depend on the particular accelerator and we will not dwell on this point further.
In order to understand the scaling of the convergence with system parameters, we consider rings with an increasing number of n = m cells with equal phase advance that each contain one corrector and one BPM, which results in a near-circulant response matrix [23]. In numerical experiments we find that the largest eigenvalue of B^⊤B increases approximately with n². Since the correction matrix K is normally close to the pseudo-inverse of B, we expect the smallest eigenvalue of KK^⊤ to show an inverse dependence on n². Moreover, B is proportional to a typical value of the beta function β̄ in the ring, which makes K ∝ 1/β̄, such that the smallest eigenvalue of σ_w²KK^⊤ scales as (σ_w/(nβ̄))². We see that all eigenvalues of P_T decrease as x_0/(1 + x_0λ_iT), albeit on a slow time scale characterized by the eigenvalues λ_i of Q. This process continues until the asymptotic regime is reached, as discussed near the start of Section VI. In the asymptotic regime, P_T continues to decrease as specified by Equation 26. We conclude that the error bars always get smaller and do so without limit. Additionally, Equation 25 implies that the approximation B̂ asymptotically approaches the "true" response matrix B.
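The quadratic growth of the largest eigenvalue of B^⊤B can be checked numerically with the standard closed-orbit response formula B_ij = √(β_iβ_j) cos(|ψ_i − ψ_j| − πQ)/(2 sin πQ); the tune values and the unit beta functions below are illustrative assumptions, not parameters from the paper:

```python
import numpy as np

def response_matrix(n, Q):
    """Closed-orbit response matrix of a ring with n identical cells,
    one BPM and one corrector each, tune Q, and beta functions set to 1:
    B_ij = cos(|psi_i - psi_j| - pi*Q) / (2*sin(pi*Q))."""
    psi = 2.0 * np.pi * Q * np.arange(n) / n     # phase at each cell
    dpsi = np.abs(psi[:, None] - psi[None, :])   # phase differences
    return np.cos(dpsi - np.pi * Q) / (2.0 * np.sin(np.pi * Q))

# Keep the fractional tune (0.31) fixed while quadrupling n, so the
# 1/sin(pi*Q) prefactor is identical for both rings.
lams = {}
for n in (8, 32):
    B = response_matrix(n, n * 0.25 + 0.31)
    lams[n] = np.linalg.eigvalsh(B.T @ B).max()

# quadrupling n should grow the largest eigenvalue roughly 16-fold
assert lams[32] > 8 * lams[8]
```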
Finally, extending the algorithm to include the settling time τ, the processing delay d, and the relaxation into a new equilibrium with time scale τ_d is straightforward by introducing unobservable state variables |α_t⟩ and |β_t⟩; their dynamic behavior is described by Equation 27. The delay d and the time constants τ and τ_d affect the stability of the closed-loop system, but we assume that the feedback designer has chosen K to ensure its stability. In Equation 27, |α_t⟩ corresponds to the field inside the vacuum chamber that the beam actually "sees" and |β_t⟩ describes, for example, the damping due to synchrotron radiation.
The observable beam position |x_t⟩ then updates with |x_{t+1}⟩ = |x_t⟩ + |β_t⟩. We note that the left side of Equation 27 enables us to uniquely determine the |α_t⟩ from the |u_t⟩, provided τ and d are known, which makes them quasi observable. Moreover, we find the |β_t⟩ = |x_{t+1}⟩ − |x_t⟩ from the |x_t⟩, which turns the right side of Equation 27 into a relation between accessible quantities. We observe that this equation has the same form as Equation 3 from the main text, with one component of the left-hand side taking the place of x^i_{s+1} − x^i_s, shifted by one time step; likewise, |α_t⟩ takes the place of |u_s⟩. The analysis from the report up to Equations 6 and 7 remains valid, but analyzing the convergence and the asymptotics goes beyond the scope of the present report.
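To make the idea concrete, the following scalar sketch assumes a first-order lag with time constant τ and a pure delay d for α_t, and it ignores the relaxation time τ_d for brevity; the recursion, the parameter values, and the scalar B are assumptions for illustration, not the paper's Equation 27. It shows that α_t can be rebuilt from the recorded u_t, that β_t follows from differencing x_t, and that the pairs (α_t, β_t) then yield B by least squares, in analogy to Equation 3:

```python
import numpy as np

rng = np.random.default_rng(2)
tau, d, steps = 5.0, 3, 2000     # assumed lag, delay, and record length
B_true, sigma_w = 1.7, 0.01      # assumed scalar response and noise level
u = rng.normal(size=steps)       # recorded corrector excitations

# alpha_t: corrector setting filtered by the delay d and the lag tau
# (assumed model: alpha_{t+1} = alpha_t + (u_{t-d} - alpha_t)/tau)
alpha = np.zeros(steps)
for t in range(steps - 1):
    u_delayed = u[t - d] if t >= d else 0.0
    alpha[t + 1] = alpha[t] + (u_delayed - alpha[t]) / tau

# simulated beam position: x_{t+1} = x_t + beta_t, beta_t = B*alpha_t + w_t
beta = B_true * alpha + sigma_w * rng.normal(size=steps)
x = np.concatenate(([0.0], np.cumsum(beta)))

# identification: alpha is rebuilt from u (tau and d known), beta from
# differences of x, and B follows from a least-squares fit as in Equation 3
beta_obs = np.diff(x)
B_fit = np.dot(alpha, beta_obs) / np.dot(alpha, alpha)
assert abs(B_fit - B_true) < 0.01
```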

VIII. CONCLUSIONS
We applied standard system identification techniques, based on recursive least-squares methods, to determine the response matrix in parallel to correcting the orbit in a storage ring.
Simulations show that the method works well, though it is rather slow and requires a large number of iterations. The speed can, however, be increased significantly by systematically adding small perturbations to the corrector magnets, so-called dithering. In this way a small deterioration of the orbit quality can be balanced against the desire to determine an accurate response matrix. We found that the convergence of B̂_T to the "real" response matrix is governed by the eigenvalues λ of the matrix Q from Equation 11, and we solved for the time dependence of the discrepancy b_T = B̂_T − B with some approximations. We found in Equation 18 that b_T scales with 1/T, but only until the magnitude of b_T becomes smaller than the error bars of the fitting process, which scale with 1/√T. Once inside the noise level, we found that the asymptotic behavior of the convergence has a log(T)/T dependence and is governed by the smallest eigenvalue of KK^⊤. In particular, both the error bars of the approximation B̂_T and the difference between B̂_T and the "real" B tend to zero in the limit of large T. Furthermore, we found that feedback systems with at least as many BPMs as correctors (n ≥ m) permit us to simultaneously stabilize the orbit and identify the response matrix B.
Several extensions of this work come to mind. First, optimizing the correction matrix K such that the smallest eigenvalue of KK^⊤ is as large as possible without spoiling the orbit quality σ_x. Second, comparing different correction strategies, for example, deriving K from "optimal control" quality measures that put a weight both on the orbit σ_x² and on the rms corrector excitation. Third, finding an optimal strategy to make the dither amplitude z time-varying, such that a global measure of performance that balances orbit correction and system identification is minimized. The regret, studied for instance in [24], may serve as an example.

Acknowledgements
This work was supported in part by the Swedish Research Council (grant 2016-00861), and the Swedish Foundation for Strategic Research (Project CLAS).

Appendix A: Sherman-Morrison formula
Here we show that P_{T+1} is given by Equation 6 if its inverse is given by P^{-1}_{T+1} = P^{-1}_T + |u_{T+1}⟩⟨u_{T+1}|. To show this, we explicitly calculate P^{-1}_{T+1}P_{T+1} and show that it evaluates to the unit matrix, such that we can use Equation 6 to update P_T with the new information that is encoded in the new corrector excitations |u_{T+1}⟩. Note that P_T = (U_T U_T^⊤)^{-1} and its inverse are symmetric by construction for all T. This implies that the order of multiplication of P_{T+1} and its inverse does not matter, so we also have P_{T+1}P^{-1}_{T+1} = 1.
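This rank-one update is easy to verify numerically. The sketch below, with an arbitrary symmetric positive definite P_T and a random |u_{T+1}⟩ (both illustrative assumptions), compares the update of Equation 6 with a direct inversion:

```python
import numpy as np

# If P_{T+1}^{-1} = P_T^{-1} + |u><u|, then by Sherman-Morrison
# P_{T+1} = P_T - P_T|u><u|P_T / (1 + <u|P_T|u>)   (Equation 6).
rng = np.random.default_rng(0)
m = 5
A = rng.normal(size=(m, m))
P = np.linalg.inv(A @ A.T + np.eye(m))   # a generic symmetric positive definite P_T
u = rng.normal(size=m)

tmp = P @ u                                           # P is symmetric, so P u = <u|P
P_update = P - np.outer(tmp, tmp) / (1.0 + u @ tmp)   # Equation 6
P_direct = np.linalg.inv(np.linalg.inv(P) + np.outer(u, u))

assert np.allclose(P_update, P_direct)
```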
Appendix B: Derivation of Equation 7
Here we introduce the abbreviation y^i_T = x^i_{T+1} − x^i_T, exploit that U_{T+1} = (|u_1⟩, . . . , |u_{T+1}⟩), and finally express P_{T+1} through Equation 6. In the next step we multiply the two square brackets and obtain four terms, where, according to Equation 5, we identify the estimate from the previous iteration T. Combining the second and the fourth term, taking the transpose of the resulting equation, and stacking the rows on top of each other then leads to Equation 7.
Appendix C: Code for one iteration
The following function receives B̂ and P, as well as the recently measured orbit |x⟩ and the dither vector |z⟩, as input and returns the updated matrices B̂_new and P_new as well as the orbit |x_new⟩ after the correction is applied. Inside the function, first the externally defined noise level σ_w, the "real" response matrix B, and the constant correction matrix K are supplied as global variables. Next, using K, the new corrector values |u⟩ are calculated, the dither |z⟩ is added to the correctors, and the new orbit |x_new⟩ is calculated, including the noise |w⟩, here implemented as normally distributed random numbers. Then the auxiliary quantity ⟨u_{T+1}|P_T is stored in the variable tmp and the inverse of the denominator of the last term in Equation 6 is calculated. The next two lines are straight implementations of Equations 6 and 7.
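A minimal Python sketch of such a function could look as follows; the dimensions, the noise level, the dither amplitude, and the choice K = 0.5·pinv(B) are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Global system parameters (illustrative values; a real feedback system
# would take these from the machine configuration).
rng = np.random.default_rng(3)
n = m = 4
B = rng.normal(size=(n, m))     # "real" response matrix (unknown in practice)
K = 0.5 * np.linalg.pinv(B)     # constant correction matrix of the feedback
sigma_w = 0.01                  # BPM noise level

def iterate(Bhat, P, x, z):
    """One closed-loop iteration: correct the orbit with K, add dither z,
    then update Bhat and P by recursive least squares (Equations 6 and 7)."""
    u = -K @ x + z                                # corrector values plus dither
    w = sigma_w * rng.normal(size=n)              # BPM noise
    x_new = x + B @ u + w                         # Equation 2
    y = x_new - x                                 # observed orbit change
    tmp = u @ P                                   # <u|P  (P is symmetric)
    denom = 1.0 + tmp @ u                         # denominator in Equation 6
    P_new = P - np.outer(P @ u, tmp) / denom                # Equation 6
    Bhat_new = Bhat + np.outer(y - Bhat @ u, tmp / denom)   # Equation 7
    return Bhat_new, P_new, x_new

# Run the loop: Bhat slowly converges toward B while the orbit stays corrected.
Bhat, P, x = np.zeros((n, m)), 1e3 * np.eye(m), np.zeros(n)
for _ in range(5000):
    Bhat, P, x = iterate(Bhat, P, x, z=0.02 * rng.normal(size=m))

assert np.linalg.norm(Bhat - B) < 0.1 * np.linalg.norm(B)
```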

Appendix D: Spatially correlated monitor noise
The assumption that the noise |w_t⟩ of the position monitors is uncorrelated is easily relaxed, and in this appendix we show how spatially correlated noise, characterized by E{|v_t⟩⟨v_s|} = σ_w² δ_ts C, affects the rest of the results, where σ_w²C is the covariance matrix of the noise. Its matrix elements on the diagonal, σ_w²C_ii, describe the square of the error bars of BPM i, and the off-diagonal elements describe the correlations among different BPMs. Note that we separated the magnitude of the noise (σ_w²) from the correlations, where C is a positive-definite and symmetric matrix with matrix elements of order unity.
Since C is symmetric, we can decompose it with an orthogonal matrix O and a diagonal matrix. Since it is positive definite, all its eigenvalues are positive and we can write the diagonal matrix as the square of another diagonal matrix D, such that C = OD²O^⊤. We will now use this representation of C to transform the dynamical system represented by Equation 2, but with correlated noise |v_t⟩: |x_{t+1}⟩ = |x_t⟩ + B|u_t⟩ + |v_t⟩,
and multiply it with D^{-1}O^⊤ from the left, which results in D^{-1}O^⊤|x_{t+1}⟩ = D^{-1}O^⊤|x_t⟩ + D^{-1}O^⊤B|u_t⟩ + D^{-1}O^⊤|v_t⟩. With the transformed variables |x̃_t⟩ = D^{-1}O^⊤|x_t⟩, B̃ = D^{-1}O^⊤B, and |w̃_t⟩ = D^{-1}O^⊤|v_t⟩, this equation reads |x̃_{t+1}⟩ = |x̃_t⟩ + B̃|u_t⟩ + |w̃_t⟩, where we have E{|w̃_t⟩⟨w̃_s|} = σ_w²δ_ts D^{-1}O^⊤COD^{-1} = σ_w²δ_ts 1. We find that this system is equivalent to the one from Equation 2, such that we can directly use the methods developed in the main body of this report. We only need to undo the transformation from Equation D4 in the end.
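The whitening step can be checked numerically: for any symmetric positive definite C, the matrix D^{-1}O^⊤ obtained from its eigendecomposition maps the noise covariance to the unit matrix. The random C below is an illustrative assumption:

```python
import numpy as np

# Build a generic symmetric positive definite correlation matrix C
# with elements of order unity, as in the text.
rng = np.random.default_rng(4)
n = 5
M = rng.normal(size=(n, n))
C = M @ M.T + n * np.eye(n)
C /= np.mean(np.diag(C))

# Decompose C = O D^2 O^T; all eigenvalues are positive.
d2, O = np.linalg.eigh(C)
Dinv = np.diag(1.0 / np.sqrt(d2))

# The transformation W = D^{-1} O^T whitens the correlated noise:
W = Dinv @ O.T
assert np.allclose(W @ C @ W.T, np.eye(n))
```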
If we apply this procedure to Equations 6 and 7, we find that these equations are unchanged; the improvement of B̂ does not depend on the noise as long as there are perturbations. Only the changes of the controller |u_t⟩ and the resulting orbit changes matter.
The correlation matrix C does, however, affect the convergence of the algorithm. Using the correlated noise |v_t⟩ instead of |w_t⟩ in Equation 10, we find that its last equality acquires the correlation matrix C. Following the reasoning from the main body, the term with s = 0 is dominant, which gives us E{|u_T⟩⟨u_T|} = σ_w²KCK^⊤, and the matrix Q from Equation 11 becomes Q = σ_w²KCK^⊤ + (z²/m)1. With this version of Q the conclusions of Section V remain the same.
The asymptotic behavior is also affected by C: Equation 21 acquires an additional factor involving the spectral radius ρ(C) of C. This results in a slight redefinition of Γ = lim_{T→∞} Γ_T, which is still finite and therefore renders the remainder of Section VI valid.