Estimation of longitudinal bunch characteristics in the LHC using Schottky-based diagnostics

The Large Hadron Collider (LHC) Schottky monitors have been designed to measure various parameters of relevance to beam quality, namely tune, momentum spread, and chromaticity. In this work, we present how this instrument can be used to estimate longitudinal bunch characteristics, such as longitudinal bunch profile or synchrotron frequency distribution. Under the assumption of bunched beams with no intrabunch coherent motion, we start by deriving the relation between the distribution of synchrotron amplitudes within the bunch population and the longitudinal bunch profile from probabilistic principles. Subsequently, we fit the cumulative power density of acquired Schottky spectra with the underlying distribution of synchrotron amplitudes. Finally, the result of this fit is used to reconstruct the bunch profile using the derived model. The results obtained with this method are verified by comparison with longitudinal profile measurements from the LHC wall current monitors.


I. INTRODUCTION
The Large Hadron Collider (LHC) transverse Schottky system, whose main objective is to provide the beam operators with noninvasive bunch-by-bunch tune and chromaticity measurements, was commissioned in 2011 [1]. In the meantime, the system has undergone major upgrades in order to improve signal quality [2]. Although qualitatively its chromaticity estimates agree with other measurement techniques (as verified in dedicated experiments), the small residual quantitative discrepancies observed need to be understood [3]. Studies are therefore underway in order to better understand the obtained spectra. Due to its simpler physical interpretation, we have focused on the spectral region in the immediate vicinity of the 427725th revolution harmonic, where transverse particle motion plays no role. As by design the difference signal (rather than the sum signal) between two opposite slot-coupled waveguides is used, the inevitable common mode leakage allows us to use this spectral region for longitudinal studies.
As a result, a new application for the system has emerged, as a bunched-beam longitudinal profile monitor. In this work, we derive a relationship between the Schottky spectrum and the longitudinal bunch profile, under the assumption of no intra-bunch coherent motion. A similar result, which uses the distribution function of the radio frequency (rf) bucket instead of the bunch profile, was presented in [4]. We also describe an alternative to the [5] method of obtaining the synchrotron frequency distribution. Finally, we provide a framework for the fast simulation of Schottky spectra, which can be used for further studies. The presented theory is supported by examples of LHC measurements and the obtained bunch profiles are compared with the wall current monitor (WCM) [6].

II. LONGITUDINAL BUNCH PROFILE
Within a bunch, the rf phase difference, Δϕ rf , between a given particle and the synchronous particle, obeys equation [7] [Eq. (9.51)]: where Ω s 0 is the nominal synchrotron frequency and ϕ rf s is the rf phase of the synchronous particle. At constant beam energy, and for hadron machines such as the LHC where the energy loss per turn is small when compared to the maximum rf power, i.e., ϕ rf s ≈ 0, this equation reduces to the pendulum equation For rf harmonic h, revolution frequency ω 0 and time amplitude (maximum time difference between a given particle and the synchronous particle) of synchrotron oscillationsτ, we have that the particle's synchrotron frequency is given by: where hω 0τ ¼ d Δϕ rf is the rf phase amplitude of synchrotron oscillations and Kð½0; 1Þ → ½π=2; ∞ is the complete elliptic integral of the first kind [ [8] p. 590]. This comes from the general theory of an arbitrary-amplitude pendulum [9].
From [10], we know that the time difference τ between a particle performing synchrotron motion and the synchronous particle can be approximated by a simple harmonic motion with amplitude-dependent Ω s , i.e., where Ω s ¼ Ω s ðτÞ. The validity of Eq. (4) can be confirmed by comparing it to the numerical solution of Eq. (1) and noting that Δϕ rf ¼ hω 0 τ. In the case of the LHC, Eq. (4) represents a good approximation for bunch lengths ≲80% the size of the rf bucket, as illustrated in Fig. 1. The size of LHC rf bucket is 2.5 ns with the nominal 4-sigma bunch length less than 1.5 ns [11]. The longitudinal bunch profile can be interpreted as the probability distribution of τ. We shall denote this distribution by BðτÞ. The assumption of no coherent intrabunch motion implies that the distribution of initial synchrotron phases, ϕ s , is uniform and independent of the distribution of synchrotron amplitudesτ. Furthermore, under stationary conditions, the longitudinal bunch profile is independent of time. Therefore, the probability of finding any particle with time difference τ with respect to the synchronous particle can be written as a function of its amplitude of oscillationτ only. We can then write: where g τ;τ ðτ;τÞ is the joint probability density of a particle having amplitudeτ and time difference τ. The second equality comes from the fact, that g τ;τ ðτ;τÞ ¼ 0 for jτj >τ.
The derivation of g τ;τ ðτ;τÞ is presented in the Appendix and yields This expression can be interpreted in an intuitive way. Let us first recall that BðτÞ is the probability density of having particles in the bunch with a time difference τ with respect to the synchronous particle. If we now think of as the probability density that a single particle, oscillating with amplitudeτ, is found at a time distance τ from the synchronous particle, then all we have to do is to multiply it by the relative amount of particles, gτðτÞ, which oscillate with that same amplitude, and integrate over all possible amplitudes to finally obtain the longitudinal bunch profile, BðτÞ. Figure 2 illustrates precisely the relation between BðτÞ and gτðτÞ for different kinds of amplitude distributions. Note that even though τ ¼ 0 is the minimal point of every amplitude contribution, it is also the only point where all the amplitudes contribute. Equation (5) allows us to calculate the longitudinal bunch profile knowing the distribution of synchrotron amplitudes. Conversely, if we are given the bunch profile, we can extract gˆτ by numerically solving an integral equation [12]. In addition, as the synchrotron amplitudes are related to the synchrotron frequencies by Eq. (3), knowing one of these distributions allows us to determine the other two, as shown in Fig. 3.

III. SCHOTTKY SPECTRUM
The Schottky spectrum is composed of a series of Bessel satellites (J p ) of finite width due to the presence of many particles with different synchrotron frequencies. The intensity signal due to a single-particle i, in the vicinity of the n-th revolution harmonic, can be written in the following form [10] Based on the above equation, we can deduce, that a singleparticle power spectral density (PSD) is deterministic, and assuming that all macroscopic parameters (n; ω 0 ; Ω s 0 Þ are set, it depends only on particle's synchrotron amplitudeτ i . It is so, because synchrotron frequency can be derived from synchrotron amplitude using Eq. satellites are evenly and symmetrically spaced, separated by the distance of Ω s i . The power of the satellite AEp is given by the value of J 2 p ðnω 0τi Þ. If we have many particles with different synchrotron amplitudes, the instantaneous Schottky spectrum does not look as simple anymore, see bottom plot of Fig. 4. Depending on the value of nω 0τi , each particle contributes in its own way and in different frequency ranges (the value of J 2 p ðnω 0τi Þ converges monotonically to zero with p for p > nω 0τi ). Different values of Ω s i result in a broadening   and, for high indices of p, even overlapping of Bessel satellites. Due to the pϕ s i phase term in Eq. (6), which vanishes for p ¼ 0, contributions to the central satellite add up coherently, thus making its power density proportional to the square of the number of particles. For all other satellites, the instantaneous power density is nondeterministic, since it depends on the random synchrotron phases of all particles, and is proportional to the number of particles in the bunch.
Let us consider the PSD, PðωÞ (where ω ≠ nω 0 ), of a pick-up signal IðtÞ ¼ P N i I i ðtÞ, which is the sum of the individual contributions of N particles. As IðtÞ is a widesense stationary process, the Wiener-Khinchin theorem [13] gives us where cðτÞ ¼ hIðtÞI Ã ðt − τÞi is the autocorrelation function of IðtÞ, h·i denotes ensemble averaging and I Ã denotes the complex conjugate of I. The previous equation can also be written in the following form As IðtÞ is a wide-sense stationary process, we do not need to specify time t. Since the synchrotron phases are uniformly and independently distributed, the first sum term of the expected value vanishes. In the second one the synchrotron phases cancel out, so we can write: If we now denote the PSD of I i ðtÞ as P i ðωÞ we can write so the PSD is just the sum of single particle contributions.

IV. MATRIX FORMALISM
From the previous section we know that, in the absence of intrabunch coherent motion, the time averaged cumulative power spectrum of N particles is equal to the sum of the individual particle spectra. From Eq. (6) we know that differences in the individual particle spectra depend only on the particle's synchrotron amplitude, as synchrotron frequency can be expressed as a function of the amplitude [Eq. (3)]. Synchrotron phase does not influence the single particle's PSD. Therefore, the Schottky spectra are explicitly dependent on the distribution of synchrotron amplitudes.
Let us assume that we know the distribution gðτÞ of synchrotron amplitudes among the particles. We may then calculate PðωÞ, the power at a given frequency, as: where Pðω;τÞ is the PSD at frequency ω of a particle with synchrotron amplitudeτ. This can be seen as the continuous analogue of Eq. (7).
Experimentally, the power spectral density is estimated as the squared magnitude of the signal's discrete Fourier transform (DFT). We adopt the notation P DFT for such an estimate. For a given frequency binning ω 1 ; …; ω m , we shall have then If we discretizeτ to a finite set of valuesτ j , 1 Eq. (8) takes the form: wheregðτ j Þ is the probability mass function of the discrete approximation of gðτÞ.
The above equation can be expressed in terms of discrete frequencies, written in matrix form as: 1 Every continuous distribution may be approximated with an arbitrary precision by a discrete distribution.
The columns of matrix M correspond to the spectrum of a single particle with synchrotron amplitudeτ i . Vector A represents the discrete approximation of the synchrotron amplitude density and vector S is the DFT estimate of the total signal PSD, which can be compared with the experimentally obtained Schottky spectrum. As previously stated, considered frequencies ω i must not cover frequency bins corresponding to the p ¼ 0 satellite, as these add up coherently and Eq. (8) does not hold. One should also note that matrix M depends on the nominal synchrotron frequency. Therefore, we shall use the notation MðΩ s 0 Þ.
The expression of Eq. (9) is a very convenient tool for studying Schottky spectra. It enables us to simulate spectra, for different beam conditions and bunch shapes, without the need to perform time consuming Monte Carlo simulations. All we need to do is to calculate n single particle spectra withτ ranging fromτ 1 toτ n .

V. BUNCH SHAPE CALCULATIONS
In order to estimate the longitudinal bunch shape, we gate our acquisition system on a single selected bunch and calculate its average discrete experimental spectrum S exp . Its length, which corresponds to the spectral resolution, depends on Schottky Monitor's sampling rate and desired sampling duration. As we aim for real-time measurements, a sampling duration of around 1 s was chosen, which in the case of LHC Schottky Monitor results in 32768 frequency bins and a spectral resolution of 0.69 Hz. It is high enough resolution to observe the inner structure of Bessel satellites, which is crucial for our further considerations. Extensive details on the acquisition architecture can be found in [2]. For all longitudinal studies' results presented in this paper, we limited ourselves to 2854 rows (jω − nω 0 j ∈ ½20; 1000 Hz). This frequency range depends on the nominal synchrotron frequency, has therefore to be different for LHC Injection and flat-top and will of course be different for other machines. As mentioned before, we need to exclude central bins, as well as corresponding rows from matrix MðΩ s 0 Þ. Similarly, should other regions of the experimental spectrum be compromised by the presence of coherent peaks, they can also be omitted in the calculation of the MðΩ s 0 Þ matrix. The number of considered amplitudesτ i [number of columns of MðΩ s 0 Þ] is a trade-off between the problem complexity and the discretization error of Eq. (9). From experience we determined that n ¼ 50 is satisfactory for our objectives, however the impact of this parameter was not rigorously investigated.
The core of our approach is to minimize the cost function where the log functions are taken point-wise and j·j is the standard Euclidean norm. We use a logarithmic cost function here in order to be more sensitive to the lowmagnitude spectral peripheries and more robust to noise, as it was observed that the noise level in the spectrum is proportional to the spectrum value. Therefore, a usage of logarithm results in correct fitting to the measured data.
Having found the optimal Ω s 0 and A, we can estimate the longitudinal bunch shape and synchrotron frequency distribution using Eqs. (3) and (5) respectively. It is certain that the experimental spectrum is susceptible to noise and finite time averaging effects. It may therefore happen that the pair ðΩ s 0 ; AÞ which minimizes CðΩ s 0 ; AÞ is different from the true nominal synchrotron frequency and amplitude density. An example of such a situation is illustrated in Fig. 5, revealing exotic shapes for the estimated amplitude distribution and bunch shape, and where we have labeled this type of fit as "free fit".
In order to overcome this feature, our proposed solution is based on the assumption that synchrotron amplitude densities follow a Rice distribution [14], which is the distribution of distances from the origin for samples taken from a circular 2D-normal distribution. It is determined by two parameters, standard deviation σ N 2D and the modulus of mean ν N 2D of the mentioned 2D-normal distribution. 2 While a correlation between the Rice distribution and the distribution of synchrotron amplitudes is plausible at the moment of injection (the synchrotron amplitude is proportional to the distance from the origin in the longitudinal phase space), the question arises why it would remain valid after rf manipulations such as the ones performed during the energy ramp. Not knowing the answer to this question, we base our assumption on observations of bunch shapes measured by the WCM at different beam phases, and their corresponding synchrotron amplitude densities [calculated from Eq. (5)], which confirm this hypothesis. We present typical amplitude densities of LHC ion beams, together with the corresponding Rice distribution fits in Fig. 6. Our assumption may be seen as a regularization, that is, introducing information which helps to solve an ill-posed problem by preventing solutions from wrongly compensating the errors. Furthermore, the number of fitting parameters is also dramatically reduced, from n parameter entries in vector A (n ¼ 50 in our implementation), to the 2 parameters: ν N 2D and σ N 2D .
Finding a solution to nonlinear problems is not always possible analytically. Therefore we decided to apply a differential evolution algorithm [15] implemented in a SCIPY library [16] in order to find the parameters which minimize the cost function CðΩ s 0 ; AÞ. Default algorithm settings were used, apart from popsize ¼ 60 and mutation ¼ ð0.5; 1.2Þ.
Matrices MðΩ s 0 Þ were pre-calculated in order to reduce the computation time needed for the evaluation of the cost function. In order to determine the bunch shape and synchrotron frequency distribution, we fit 5 parameters in total. The first three have already been mentioned, these are μ and σ of the Rice distribution and the nominal synchrotron frequency. Additionally, we need to fit the scale c 1 , as the magnitude of the observed PSD may change, and finally, we need to take into consideration that some information may be masked by noise and we may actually only see the top part of the spectrum. 3 It is worth noting, that the noise level is substracted from S exp during preprocessing. The final cost function including all these parameters takes the form where S min is the minimal value of S exp . It turned out, that in the analysed spectra the noise parameter c 2 was observed to be negligible.
In order to increase convergence rate, parameters were bounded within the ranges specified in Table I, which are broad enough to include all possible physical solutions according to the machine configuration. Figure 7 shows an example comparison between the experimental spectrum (in blue), the spectrum obtained after the "free fit" (in orange) and the spectrum obtained after the Rice fit (in green). It can be seen that both obtained spectra follow very well the overall experimental spectrum, including the fine details of the internal structure of the Bessel satellites. Finally, comparing the calculated profiles with the ones obtained from WCM measurements, confirms the accuracy of the proposed method. This is shown in Fig. 8 where a WCM profile measurement is compared with bunch profiles calculated from several spectra acquired from longitudinal spectra of both horizontal and vertical Schottky systems.  I. Parameter bounds used for SciPy differential evolution algorithm. Scale and noise bounds are not given, as they are strictly dependent on spectra processing. Rice and free fit spectra compared with an experimental Schottky spectrum. There is no fit for the p ¼ 0 satellite, as it adds up coherently and is therefore not described by Eq. (8). 3 The scale parameter c 1 should be such, that values of c 1 MðΩ s 0 Þ · A and S exp are comparable. As columns of MðΩ s 0 Þ have approximately the same sum and A sums up to one, the scale parameter should be approximately equal to sumfS exp g=sumfM ·;i ðΩ s 0 Þg, where sumfM ·;i ðΩ s 0 Þg is the sum of any single column. Due to effects of postprocessing (such as baseline substraction), and errors of above mentioned approximations, we scan one order of magnitude range around the predicted value of c 1 . The offset parameter c 2 should be a fraction of maxfS exp g. In our implementation the whole range ½0; maxfS exp g is scanned.

VI. CONCLUSION
Within the scope of this paper, under the assumption of no intrabunch coherent motion, we have derived the relationship between three important bunch characteristics: longitudinal bunch profile, distribution of synchrotron amplitudes and distribution of synchrotron frequencies. Following, we have linked these characteristics to the Schottky spectrum in the form of matrix equation Eq. (9). Finally, we have presented a method of solving equation Eq. (9), that is estimating longitudinal bunch profile (as well as other mentioned characteristics) from the measured Schottky spectrum.
The results obtained have been verified in three stages. First, the experimental Schottky spectrum was compared to the one obtained from the optimization procedure showing that the Rice-based fit compares well with the observed PSD (and is also similar to the free fit). Secondly, bunch profiles calculated from vertical and horizontal monitors were observed to be self-consistent. Finally, comparing the calculated profiles with those obtained with the WCM confirms the accuracy of the proposed method.
The aim of this study is not to shift the main purpose of the LHC Schottky systems into a longitudinal profile measurement device, for which the WCMs provide a more direct diagnostic, but as an important step toward improving Schottky-based estimations in the LHC. The proposed procedure will now be adapted to transverse signals, which contain additional information on tune and chromaticity, and where the estimated bunch profiles are envisaged to be used also as a quality indicator for the derived quantities.
An additional advantage of the derived matrix formalism [Eq. (9)] is the fact that the fitting procedure does not require the use of all the points in the Schottky spectrum. For example, if certain spectral regions are affected by the presence of coherent peaks, they can be excluded from the analysis by simply omitting the corresponding rows of matrix MðΩ s 0 Þ. This linear formalism can also be used to predict the spectral response of Schottky diagnostics to various combinations of beam/machine parameters at different harmonics.

ACKNOWLEDGMENTS
The authors would like to acknowledge the stimulating discussions with O. R. Jones and T. Lefèvre, as well as the discussions and assistance of T. Argyropoulos, T. Levens, O. Marqversen, and M. Wendt.

APPENDIX: ANALYTICAL DERIVATION OF THE LONGITUDINAL BUNCH PROFILE
The aim of this Appendix is to derive Eq. (5). We shall start from Obtaining the explicit form of g τ;τ ðτ;τÞ is not straightforward, as τ andτ are not independent, but it can be derived from the joint distribution of initial synchrotron phases and amplitudes g ϕ s ;τ . We can write g ϕ s ;τ ðϕ s ;τÞ ¼ g ϕ s ðϕ s ÞgτðτÞ ¼ gˆτðτÞ 2π ; as these random variables are independent and ϕ s is uniformly distributed. In addition, let us define the transformation u ¼ ðu 1 ; u 2 Þ∶ðϕ s ;τÞ ↦ ðτ;τÞ; where u 1 comes from Eq. (4) and u 2 is the identity function ofτ: u 1 ðϕ s ;τÞ ¼τ cos ðΩ s t þ ϕ s Þ; u 2 ðϕ s ;τÞ ¼τ: Conversely, having the pair ðτ;τÞ, we can determine ϕ s as one of the following: