Predicting the transverse emittance of space charge dominated beams using the phase advance scan technique and a fully connected neural network

The transverse emittance of a charged particle beam is an important figure of merit for many accelerator applications, such as ultra-fast electron diffraction, free electron lasers and the operation of new compact accelerator concepts in general. One of the easiest methods to implement for determining the transverse emittance is the phase advance scan method using a focusing element and a screen. This method has been shown to work well in the thermal regime. In the space charge dominated laminar flow regime, however, the scheme becomes difficult to apply, because of the lack of a closed description of the beam envelope including space charge effects. Furthermore, certain mathematical, as well as beamline design criteria must be met in order to ensure accurate results. In this work we show that it is possible to analyze phase advance scan data using a fully connected neural network (FCNN), even in setups that do not meet these criteria. In a simulation study, we evaluate the performance of the FCNN by comparing it to a traditional fit routine based on the beam envelope equation. Subsequently, we use a pre-trained FCNN to evaluate measured phase advance scan data, which ultimately yields much better agreement with numerical simulations. To tackle the confirmation bias problem, we employ additional mask-based emittance measurement techniques.


I. INTRODUCTION
Many modern particle accelerators are tuned to achieve the smallest possible transverse beam emittance, because most users demand the highest possible beam brightness. Beam brightness is important for many accelerator applications, such as ultrafast electron diffraction [1], free electron lasers [2] and the operation of new compact accelerator concepts in general (e.g. [3][4][5][6]). A common definition of brightness is [7]

$$B = \frac{\eta I}{\pi^2 \varepsilon_x \varepsilon_y},$$

where η is a form factor close to unity, I is the beam peak current and $\varepsilon_{x,y}$ are the horizontal and vertical transverse emittance, respectively. Hence, in order to maximize B, the transverse emittance has to be minimized. There are multiple methods to characterize the transverse emittance. One of the most common is the phase advance scan technique, where the transverse beam size is recorded on a screen vs. the focusing strength of an upstream quadrupole or solenoid magnet [8][9][10]. The data can then be fitted based on the beam envelope equation. Space charge effects can be included to some extent [11,12]. Instead of scanning the focusing strength of a magnet, multiple screens can also be used to record the beam size vs. the phase advance. Other, potentially single-shot, methods involve the insertion of masks into the beamline, which can subsequently be imaged on a downstream screen [13]. Coupled with advanced reconstruction algorithms, these methods are capable of reconstructing the core 4D phase space [14]. In this work we concentrate on the phase advance scan technique, as it is the easiest one to implement, requiring only standard beamline components.

* frank.mayet@desy.de
One of the limitations of the phase advance scan technique is that there is no closed description of the beam envelope for space charge dominated beams [11,12]. It is therefore difficult to apply the method in this regime. Space charge dominated beams occur, for example, in the injector part of high-brightness electron sources, where the beam is still non-relativistic. In order to quantify whether a beam is space charge dominated, the so-called laminarity parameter ρ can be calculated [15]. This parameter represents the ratio between the space charge term and the emittance term of the beam envelope equation. It is given by

$$\rho = \frac{I \sigma^2}{2 I_A \gamma \varepsilon_n^2}, \qquad (2)$$

where I is the peak current of the beam, σ is the rms beam size, $I_A \approx 17$ kA is the Alfvén current and $\varepsilon_n = \beta\gamma\varepsilon$ is the normalized emittance with the Lorentz factor γ and β = v/c. In case ρ ≫ 1, the beam can be considered space charge dominated (laminar flow regime). Otherwise the evolution of the beam envelope is dominated by the emittance pressure (thermal regime).
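As a rough numerical illustration, the laminarity parameter can be evaluated in a few lines of Python. This is only a sketch: it assumes the commonly used form ρ = I σ² / (2 I_A γ ε_n²) with σ the rms beam size, the function names are hypothetical, and the example numbers are made up for illustration rather than taken from this work. The threshold ρ > 1 is a simplification of the ρ ≫ 1 condition.

```python
ALFVEN_CURRENT_A = 17.0e3  # I_A, approximately 17 kA as quoted in the text


def laminarity(peak_current_a, sigma_m, gamma, eps_n_m_rad):
    """Ratio of space charge term to emittance term (assumed standard form)."""
    return peak_current_a * sigma_m**2 / (
        2.0 * ALFVEN_CURRENT_A * gamma * eps_n_m_rad**2
    )


def regime(rho):
    """Crude classification; the text uses rho >> 1 for the laminar regime."""
    return "space charge dominated" if rho > 1.0 else "thermal"


# Hypothetical beam: 1 A peak current, 0.5 mm rms size, gamma = 6.8,
# normalized emittance 0.1 mm mrad.
rho = laminarity(1.0, 0.5e-3, 6.8, 0.1e-6)
print(rho, regime(rho))
```

Because ρ scales with σ²/ε_n², even a modest peak current can place a low-energy, low-emittance beam deep in the space charge dominated regime.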
In this work we show in simulation that it is possible to successfully analyse phase advance scan data for ρ ≫ 1 beams using a pre-trained fully connected neural network (FCNN). Subsequently, we apply the method to measured data. Machine learning and neural networks in particular have recently been used in the context of accelerators for various purposes. These include, among others, fault detection of machine components [16], machine stability optimization and analysis [17][18][19], virtual diagnostics [20][21][22], beam quality optimization in plasma accelerators [23,24] and orders of magnitude speed-up in multiobjective optimization of accelerator parameters [25]. Here, we focus on the analysis of otherwise difficult to interpret measurement data.

II. MEASUREMENT TECHNIQUE
The state of a single particle with respect to a given design, or reference, trajectory is usually defined by the 6D phase space vector

$$(x, x', y, y', z, \delta)^T,$$

where $x' = p_x/p_z$ and $y' = p_y/p_z$ are the horizontal and vertical divergence, respectively, x, y, z are the distances of the particle from the reference trajectory and $\delta = \Delta p/p_0$ is the relative deviation of the particle's individual momentum from the reference momentum. $p_x$, $p_y$, $p_z$ are the three momentum components and $()^T$ denotes the transpose of a matrix. We are interested in the evolution of the 4D transverse phase space vector

$$(x, x', y, y')^T.$$

Assuming negligible correlation between the evolution of the phase space coordinates in the x and y planes, it is possible to treat them separately, yielding the two subspace vectors

$$\mathbf{x} = (x, x')^T, \qquad \mathbf{y} = (y, y')^T.$$

A common framework to describe the evolution of a charged particle is linear beam optics, where only linear transformations of $\mathbf{x}$ and $\mathbf{y}$ are taken into account. Each beamline element is represented by a so-called transfer matrix M defined by the relation

$$\mathbf{x} = M \mathbf{x}_0,$$

where $\mathbf{x}_0 = (x_0, x_0')^T$ is the initial phase space coordinate and

$$M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}.$$

The 2 × 2 matrix for a simple drift is given by

$$M_D(s) = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix},$$

where s is the drift distance. Applying this matrix to $\mathbf{x}_0$ results in $x = x_0 + x_0' s$, $x' = x_0'$, as expected.
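The drift example can be checked numerically with a minimal sketch in plain Python. The helper names (`matmul2`, `apply_matrix`, `drift`) and the particle coordinates are hypothetical and chosen only for illustration.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def apply_matrix(m, vec):
    """Apply a 2x2 transfer matrix to a phase space vector (x, x')."""
    x, xp = vec
    return (m[0][0] * x + m[0][1] * xp, m[1][0] * x + m[1][1] * xp)


def drift(s):
    """Transfer matrix of a field-free drift of length s."""
    return [[1.0, s], [0.0, 1.0]]


# Hypothetical particle: 1 mm offset, 2 mrad divergence, 0.5 m drift.
x0 = (1.0e-3, 2.0e-3)
x1 = apply_matrix(drift(0.5), x0)  # offset grows by x0' * s, divergence unchanged

# Two consecutive drifts combine into one drift of the summed length.
two_drifts = matmul2(drift(0.3), drift(0.2))
```

Chaining elements is a matrix product in reverse beamline order, which is the property the scan model relies on when a focusing element is followed by a drift to the screen.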
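In an experiment only the rms beam size $\sigma_x = \sqrt{\langle x^2 \rangle}$ is accessible, where $\langle \cdot \rangle$ denotes the second central moment. Applying this to the general equation $x = M_{11} x_0 + M_{12} x_0'$ yields

$$\sigma_x^2 = M_{11}^2 \langle x_0^2 \rangle + 2 M_{11} M_{12} \langle x_0 x_0' \rangle + M_{12}^2\, \frac{\varepsilon_x^2 + \langle x_0 x_0' \rangle^2}{\langle x_0^2 \rangle}, \qquad (10)$$

with $\varepsilon_x = \sqrt{\langle x^2 \rangle \langle x'^2 \rangle - \langle x x' \rangle^2}$. Equation 10 is the so-called rms envelope equation, which can be used to determine the transverse rms emittance $\varepsilon_x$ using a suitable (i.e. tunable) beam transformation M. Note that Eq. 10 does not take any space charge effects into account and is hence only valid in the ρ ≈ 1 regime. Figure 1 shows a sketch of a potential measurement scenario. The elements and distances are chosen according to what is installed at the ARES electron linac at DESY, Hamburg [26]. The transfer matrix of the double solenoid magnet can be written as

$$M_{DS} = \begin{pmatrix} 1 - l_D/f & l_D \\ -2/f + l_D/f^2 & 1 - l_D/f \end{pmatrix},$$

where $l_D$ is the drift distance between the two single solenoids and f the focal length of each solenoid. Here the approximation that f is larger than the length of the solenoid was used, i.e. the thin lens approximation. The focal length of a solenoid is given by [27]

$$f = \left[ \left( \frac{q B_{z,\max}}{2 p_z} \right)^2 \frac{\int B_z^2 \,\mathrm{d}z}{B_{z,\max}^2} \right]^{-1},$$

where $B_{z,\max}$ is the peak magnetic field, q the particle charge and $p_z$ the average longitudinal beam momentum.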
Here $\int B_z^2 \,\mathrm{d}z / B_{z,\max}^2$ is the second field integral of the on-axis magnetic field. By inserting the expression $M_D(l_S) \cdot M_{DS}$, where $l_S$ is the drift between the solenoid and the screen, into Eq. 10, it can be seen that the $M_{ij}$ elements can now be conveniently adjusted in the experiment as $B_{z,\max}$ is varied. The emittance at the position of the solenoid can thus be determined by fitting the recorded $\sigma_{x,i}$ vs. $B_{z,\max,i}$ at the screen with Eq. 10. It is possible to include transverse space charge forces in the model to some extent. This is done by including a defocusing term in the drift between the focusing element and the screen. Considering a uniformly charged cylindrical bunch with radius R and length L, the envelope equation in differential form (Eq. 13) [12] then contains a form factor $G(\xi, A) \in [0, 1]$, which depends on the centered longitudinal intra-bunch coordinate ξ and the rest frame aspect ratio $A = R/(\gamma L)$, and the so-called generalized perveance P, which depends on the total charge Q of the bunch, the vacuum permittivity $\varepsilon_0$ and the electron rest mass $m_e$. In principle Eq. 13 can now be used to construct a fit model similar to Eq. 10. There are a number of caveats to take into account, however:
• The model is only fully valid for a perfectly cylindrical bunch.
• The form factor G depends on the aspect ratio, which depends on the transverse beam size and bunch length, both not being constant in the experiment (especially in the over-focused part of the scan).
• Since the concept of the envelope equation relies on the emittance being a constant of motion, nonlinear space charge forces are intrinsically neglected in this approach.
• The perveance term depends on the bunch length, which might not be accessible to sufficient precision in the experiment.
• Equation 13 cannot be solved analytically and has to be approximated by a polynomial series (see [11] for a detailed description).
All of these caveats lead to the conclusion that the envelope equation based data analysis method is not ideal in the ρ ≫ 1 regime. Independent of the value of ρ, two more criteria need to be met in order to ensure an accurate fit result. Considering the third term of Eq. 10, the purely mathematical criterion can be derived [11]. This criterion ensures the numerical significance of $\varepsilon_x$. The second criterion is based on the fact that the scan needs to include a minimum. The initial beam optics needs to be set up such that a potential focus of the beam lies behind the screen used to measure the beam size. At the same time, the focusing element needs to be strong enough to focus the beam onto the screen, which implies a constraint on the distance between focusing element and screen. Using Eq. 10, these considerations can be summarized by the criterion where $(\sigma_\mathrm{eff})' = -(\sigma_{x,0}/f_\mathrm{max} - (\sigma_{x,0})')$. If both criteria are fulfilled, the emittance can be retrieved. Based on the aforementioned considerations, we propose an alternative way to analyze phase advance scan data. Specifically, we propose using a pre-trained FCNN to overcome the problem of the incomplete fit model in ρ ≫ 1 cases, as well as the criterion described by Eq. 15. To this end, we have performed a simulation study, which is presented in detail in the following sections. The resulting FCNN was subsequently applied to real world data, as shown below.
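The thermal-regime forward model described in this section (thin-lens double solenoid followed by a drift to the screen, without space charge) can be sketched as follows. This is a hedged illustration, not the fit routine used in this work: the rms envelope is written directly in terms of the initial second moments, the scan is performed over the focal length f instead of the peak field, and all distances and beam moments are hypothetical.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def drift(s):
    return [[1.0, s], [0.0, 1.0]]


def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def double_solenoid(f, l_d):
    """Two identical thin lenses separated by a drift of length l_d."""
    return matmul2(thin_lens(f), matmul2(drift(l_d), thin_lens(f)))


def sigma_at_screen(f, l_d, l_s, x2, xxp, xp2):
    """rms beam size at the screen from the initial moments <x^2>, <xx'>, <x'^2>."""
    m = matmul2(drift(l_s), double_solenoid(f, l_d))
    m11, m12 = m[0]
    return math.sqrt(m11**2 * x2 + 2.0 * m11 * m12 * xxp + m12**2 * xp2)


# Hypothetical setup: 0.1 m between the solenoids, 1 m to the screen,
# 0.5 mm rms size, 0.2 mrad rms divergence, no initial correlation.
L_D, L_S = 0.1, 1.0
X2, XXP, XP2 = (0.5e-3) ** 2, 0.0, (0.2e-3) ** 2

focal_lengths = [0.5 + 0.05 * i for i in range(91)]  # 0.5 m ... 5.0 m
sizes = [sigma_at_screen(f, L_D, L_S, X2, XXP, XP2) for f in focal_lengths]
i_min = min(range(len(sizes)), key=sizes.__getitem__)
print(focal_lengths[i_min], sizes[i_min])
```

A traditional analysis would fit the three initial moments to measured beam sizes with such a model; the scan range in this example is chosen so that the beam size minimum lies inside it, in line with the second criterion discussed above.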

III. SIMULATION STUDY - METHODOLOGY
It was shown as early as 1989 that neural networks with only one unbounded hidden layer can approximate any Borel measurable function from one finite dimensional space to another to arbitrary precision [28,29]. More recent research focuses on the expressiveness (approximation accuracy) of both depth (i.e. the number of hidden layers) and width (i.e. the number of artificial neurons in a layer) bounded networks. In [30], for example, the authors show that any Lebesgue integrable function $f: \mathbb{R}^n \to \mathbb{R}$ on n-dimensional space can be approximated to arbitrary accuracy by a fully connected width-$(d_\mathrm{in} + 4)$ ReLU network with respect to the $L^1$ norm as a measure of approximation quality. In other words, the network represented by the transfer function F satisfies $\int_{\mathbb{R}^n} |f(x) - F(x)| \,\mathrm{d}x < \epsilon$, $\forall \epsilon > 0$ (see [30], Theorem 1). ReLU here refers to the so-called Rectified Linear Unit neuron activation function, defined by ReLU(x) = max(0, x), and $d_\mathrm{in}$ is the input dimensionality. This width boundary w has since been refined and generalized in terms of the input and output dimensionality, $d_\mathrm{in}$ and $d_\mathrm{out}$ respectively, for example in [31,32]. Limits on the depth can be estimated in specific cases in terms of the so-called modulus of continuity of f, given by $\omega_f(\varepsilon) = \sup\{ |f(x) - f(y)| : |x - y| \leq \varepsilon \}$, where ε is an arbitrarily small change in the argument of f. For continuous functions $f: [0, 1]^{d_\mathrm{in}} \to \mathbb{R}_+$ the depth of a $(d_\mathrm{in} + 2)$ wide network N can, for example, be expressed as $\mathrm{depth}(N_\varepsilon) = 2 \cdot d_\mathrm{in}! / \omega_f(\varepsilon)^{d_\mathrm{in}}$, cf. [31]. We note that in practice the specific layout of a neural network is often determined experimentally, as the aforementioned boundaries are merely based on proofs of existence.
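To see why such existence bounds are of little practical use for choosing a layout, the depth expression quoted above can be evaluated for a few cases. The sketch below is illustrative only: the function names are hypothetical and the value used for the modulus of continuity is an arbitrary assumption, not a property of the function studied here.

```python
import math


def relu(x):
    """Rectified Linear Unit, ReLU(x) = max(0, x)."""
    return max(0.0, x)


def depth_bound(d_in, omega):
    """Depth estimate 2 * d_in! / omega**d_in for a (d_in + 2) wide network."""
    return 2 * math.factorial(d_in) / omega**d_in


print(depth_bound(2, 0.5))   # small toy problem
print(depth_bound(40, 0.1))  # 40 inputs, assumed omega = 0.1: astronomically deep
```

For an input dimensionality comparable to a typical scan, the bound exceeds any trainable depth by many orders of magnitude, which is consistent with determining the layout empirically.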
In this study, we aim to map the phase advance scan data to the normalized transverse emittance at the focusing element. Mathematically, this means we assume a connection of the scan data to the physical quantity of the form $f: \mathbb{R}_+^{d_\mathrm{in}} \to \mathbb{R}_+$, where the dimensionality $d_\mathrm{in}$ is given by the number of scan data points. Note that, based on the knowledge of the problem, we can always map (normalize) the input data from $\mathbb{R}_+^{d_\mathrm{in}}$ into $[0, 1]^{d_\mathrm{in}}$. The function f operates on the measure space $(\mathbb{R}_+^{d_\mathrm{in}}, \mathcal{B}, \lambda)$ with the Borel σ-algebra $\mathcal{B}$ and the Lebesgue measure λ. It is hence measurable in the mathematical sense. In addition, we expect f to be a continuous function based on the physical background of the problem. We can hence conclude that f is Lebesgue integrable and suitable to be approximated, for example, by a width bounded ReLU network.
To validate this approach, we set up a simulation study based on the simple beamline layout shown in Fig. 1. The main simulation study is split into three parts: 1. Building a large number of data sets (training, validation and test), 2. Training the FCNN and evaluation of the performance using the test data sets, 3. Comparison of the FCNN performance to the traditional fit method, as discussed above.
The first step of the simulation study is to build a large number of data sets. Creating a data set consists of two steps:
• Numerical tracking of the beam from the cathode to the location of the solenoid,
• Numerical simulation of the solenoid scan.
First, the emittance at the solenoid position is determined by numerical tracking of the particles. In this step the solenoid field is set to zero. In addition to the emittance, other beam parameters, such as the beam size, divergence, or bunch length can be recorded as well. Then, the simulation domain is extended up to the position of the screen, which is used in the experiment to record the beam size vs. the solenoid focusing strength. The experiment is then simulated for M focusing strength settings. It is important to set up the scan range such that the resulting data includes the beam size minimum, i.e. the focus, as it carries most of the information about the emittance at the solenoid position [11]. We use the well established code ASTRA [33], which takes space charge effects into account. The beam size vs. focusing strength scan data functions as the data set to be interpreted by the FCNN. Over the course of the study, a specific way to prepare the input data turned out to yield the best results. For each scan, M/2 scan points centered around the minimum beam size are interleaved with the corresponding relative focusing strength differences with respect to $B_\mathrm{foc}$, the setting corresponding to the minimal beam size. The data set $S_\mathrm{in}$ then consists of these interleaved values, with $\sigma_i$ denoting the ith rms beam size. Each of these data sets is labeled with a set of important beam and simulation input parameters. These labels are then used to perform so-called supervised training of the FCNN. After the learning process, the FCNN is able to predict each of these parameters from given scan data, which is prepared according to Eq. 17. For this particular study N = 16066 random data sets with M = 40 were produced. Each data set differs in the three key ASTRA input parameters: total charge, laser spot size and cathode emission time. Table I summarizes the parameter ranges used for this study, which are loosely based on typical settings at the ARES linac at the time.
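The input preparation described above can be sketched in plain Python. Several details are assumptions made for this illustration and are not confirmed by the text: the relative focusing strength difference is taken as (B_i − B_foc)/B_foc, each beam size is placed before its difference in the interleaved vector, the window is shifted if the minimum lies close to the edge of the scan, and the function name `build_input` and the toy scan are hypothetical.

```python
def build_input(b_settings, sigmas, m):
    """Return S_in of length m: m/2 beam sizes interleaved with relative offsets."""
    half = m // 2
    i_foc = min(range(len(sigmas)), key=sigmas.__getitem__)  # beam size minimum
    start = i_foc - half // 2
    start = max(0, min(start, len(sigmas) - half))  # keep the window inside the scan
    b_foc = b_settings[i_foc]
    s_in = []
    for i in range(start, start + half):
        s_in.append(sigmas[i])                          # rms beam size sigma_i
        s_in.append((b_settings[i] - b_foc) / b_foc)    # assumed relative difference
    return s_in


# Toy scan with 40 settings and a V-shaped beam size curve (hypothetical values).
b_settings = [0.10 + 0.005 * i for i in range(40)]
sigmas = [1.0e-4 + 5.0e-3 * abs(b - 0.2) for b in b_settings]
s_in = build_input(b_settings, sigmas, 40)
```

Expressing the focusing strength relative to the focus setting makes the input less sensitive to the absolute calibration of the magnet, which is one plausible reason why such a representation works well.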
The parameters are varied according to a uniform distribution. The ARES S-band gun was simulated with the peak gradient of 65 MV/m available at the time, resulting in a final γ = 6.8. Based on the parameter ranges shown in Table I, a convergence study in terms of the required number of macro particles in the numerical simulation was performed. For the highest possible charge density, 10000 particles were found to be sufficient. The neural network was implemented using the TensorFlow framework [34]. The input layer has M neurons, corresponding to the length of $S_\mathrm{in}$. Then one hidden layer with M and two hidden layers with M/2 neurons are added in order to capture non-linearities in the system. The system is then coupled to the output layer of size $M_\mathrm{out}$, which corresponds to the number of ASTRA input and simulated beam parameters to be predicted by the network. The overall layout is hence $M \to M \to M/2 \to M/2 \to M_\mathrm{out}$. This particular layout was determined empirically. Each neuron is coupled to every neuron of the following layer, or in other words the layers are fully connected. The neurons are activated using the well established rectified linear activation function (ReLU) [35,36]. Training of the network is performed using a combination of the adam and adagrad gradient descent algorithms [37] with the mean squared error (MSE) as the loss function. The available N data sets are split into three categories: $N_\mathrm{tra}$ training sets, $N_\mathrm{val}$ validation sets and $N_\mathrm{tes}$ test sets. The training sets are used to adjust the neuron weights during the training procedure, while the performance of the network is judged after each so-called epoch based on the validation sets, which are not used during training. This is done to avoid overfitting the training data. An epoch refers to one forward and backward pass of the entire training data. Finally, the performance of the resulting model is determined using the test sets, which have not been part of the learning procedure at all.
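The size of the described network can be made concrete with a short calculation. The sketch below lists the layer widths for the layout described above and counts the trainable weights and biases of the fully connected layers. The value M_out = 7 is inferred from the seven predicted quantities mentioned in the results and is an assumption here, as are the function names.

```python
def layer_widths(m, m_out):
    """Input layer, one hidden layer of width m, two of width m/2, output layer."""
    return [m, m, m // 2, m // 2, m_out]


def count_parameters(widths):
    """Weights plus biases of consecutive fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


widths = layer_widths(40, 7)           # M = 40 scan values, assumed M_out = 7
n_params = count_parameters(widths)    # a few thousand trainable parameters
print(widths, n_params)
```

In a framework such as TensorFlow, this layout would typically correspond to a stack of densely connected layers with ReLU activation; with only a few thousand parameters the model is small compared to the number of available data sets.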
We use the common split of N tra = 0.6 · N , N val = 0.2 · N and N tes = 0.2 · N . The network was trained for ∼ 10000 epochs using adam and another ∼ 10000 epochs using adagrad [38].
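A minimal sketch of such a 60/20/20 split is shown below. The shuffling, the fixed seed and the rounding convention (fractions rounded down, remainder assigned to the test sets) are assumptions made for this illustration; the function name is hypothetical.

```python
import random


def split_indices(n, f_train=0.6, f_val=0.2, seed=0):
    """Shuffle data set indices and split them into training, validation and test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_tra = int(f_train * n)
    n_val = int(f_val * n)
    return idx[:n_tra], idx[n_tra:n_tra + n_val], idx[n_tra + n_val:]


tra, val, tes = split_indices(16066)
print(len(tra), len(val), len(tes))
```

Keeping the three index sets disjoint is what allows the test sets to serve as an unbiased measure of the final prediction performance.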

IV. SIMULATION STUDY - DATA SET
Before evaluating the prediction performance of the neural network, it is useful to inspect the training data set. Since we are especially interested in analyzing phase advance scan data for space charge dominated beams, the laminarity parameter ρ at the solenoid position was calculated for each data set (cf. Eq. 2). Figure 3 shows ρ vs. the bunch charge. The color scale indicates the laser spot size on the cathode used in the particular simulation. In addition, the distribution of ρ across the whole data set is shown. It can be seen that all of the data sets lie in the ρ ≫ 1, i.e. space charge dominated, regime ($\rho_\mathrm{min} = 17.8$). Also, the higher the charge, the higher the value of ρ, as expected. In addition, the color scale reveals that the smaller the laser spot size on the cathode, the higher the value of ρ. The sensitivity of ρ to the laser spot size strongly depends on the bunch charge.
As noted above, the traditional fit method only works if Eq. 15 is satisfied. Figure 4 shows the fit feasibility criterion for each data set, with the same color code as in Fig. 3. None of the data sets satisfies the criterion, which leads to the expectation that the traditional fit method should not work well on the training data (and with that in reality for the ARES working point, which is the basis for the parameter space shown in Table I).

V. RESULTS AND COMPARISON
In this section the performance of the pre-trained FCNN is presented. We also compare its performance against the traditional fit method discussed above. In order to evaluate the performance of the FCNN, we try to predict the labels of the $N_\mathrm{tes}$ test data sets, which were not used in the supervised training procedure. The main goal of the study is to predict the transverse emittance, but since the labels include a number of other simulation and beam parameters, the FCNN also provides predictions of these. In order to better quantify the prediction performance for the different label components, the relative error between prediction and truth was calculated for each data set. This is shown in Fig. 5. In addition to the error distributions, a radar plot visualizes the prediction performance in terms of the number of data sets in 1 %, 5 % and 10 % relative error intervals respectively (see Table II for the actual percentages). In the ideal case, the heptagon would be filled completely. Inspection of the results reveals that some quantities are predicted much better than others. Specifically, it can be seen that the cathode emission time is predicted particularly poorly. This result is somewhat expected, however, because this quantity refers to the longitudinal phase space at emission time, which cannot directly be accessed via a transverse beam size measurement. All quantities that refer to the transverse phase space at the solenoid show very good prediction performance with < 5 % error. The prediction performance of both bunch charge and bunch length at the solenoid needs to be considered in more detail. In the case of the bunch charge, values below 0.5 pC are predicted much less accurately. This can be explained by the lack of significant space charge effects, which alter the shape of the beam size vs. focusing strength curve, effectively leading to degeneracy w.r.t. the initial bunch charge.
Despite being a quantity of the longitudinal phase space, the bunch length is generally predicted with an error < 10 %. This is because here the bunch length is directly correlated with the bunch charge and hence space charge effects. The prediction performance decreases towards smaller bunch lengths, which can be explained by the fact that the bunch length at the solenoid actually increases with the initial bunch charge. Therefore the same argument applies as for the bunch charge.
From the prediction results and the ground truth a mean relative error was calculated over the whole test data set, yielding a mean prediction error for the emittance at the solenoid of 1.0 %. Beam size and divergence at the solenoid are predicted very accurately with an error less than 0.1 %. In addition to the beam parameters at the solenoid position, ASTRA input parameters were predicted. The laser spot size is predicted with an error down to 0.7 %. Emission time and bunch charge are predicted with errors of 32.1 % and 27.1 % respectively. The fact that the laser spot size is predicted best out of the three input parameters is due to the fact that it has the strongest effect on the shape of the beam size vs. focusing strength data, especially for low charges (linear dependence of the thermal emittance). These results, as well as the percentage of predictions within a 1 %, 5 % and 10 % relative error interval are summarized in Table II.
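The error statistics quoted here (mean relative error and the fraction of predictions within 1 %, 5 % and 10 %) can be computed as in the following sketch. The function names and the four example values are hypothetical and serve only to illustrate the bookkeeping.

```python
def relative_errors(predicted, truth):
    """Relative deviation |prediction - truth| / |truth| for each data set."""
    return [abs(p - t) / abs(t) for p, t in zip(predicted, truth)]


def fraction_within(errors, limit):
    """Fraction of data sets with a relative error not exceeding the limit."""
    return sum(e <= limit for e in errors) / len(errors)


# Hypothetical emittance labels and predictions (arbitrary units).
truth = [1.0, 2.0, 4.0, 5.0]
predicted = [1.005, 2.06, 4.3, 5.0]

errors = relative_errors(predicted, truth)
mean_error = sum(errors) / len(errors)
summary = {limit: fraction_within(errors, limit) for limit in (0.01, 0.05, 0.10)}
print(mean_error, summary)
```

Reporting both the mean and the interval populations is useful because a small number of badly predicted data sets can dominate the mean while leaving the interval fractions almost unchanged.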
The main goal of the study is to find a better way to determine the transverse emittance from phase advance scan data in the ρ ≫ 1 regime, as well as in regimes where Eq. 15 does not hold. It is hence useful to evaluate the emittance prediction performance in the form of the relative error versus these two quantities. Figure 6 shows the result of this analysis using both the FCNN and the traditional fit routine (cf. Sec. II).
As expected from Fig. 3 and Fig. 4, the traditional fit yields inaccurate results across the whole data set. The FCNN, on the other hand, performs much better even for very high values of ρ.

VI. EXTENSION TO MEASURED DATA
So far, the neural network was trained and tested solely with ideal phase advance scan data. This means that the training, validation, as well as the test data sets contain only perfectly evenly spaced data points with flawless beam size values. In addition, all scans were simulated within the same focusing strength range. Although this approach yields very good prediction performance for simulated data, it cannot be applied to experimental data, for several reasons. First, in a measured data set it is not guaranteed that all data points are evenly spaced. It is also not guaranteed that the scan range corresponds to the trained one and that the measured values of the focusing strength are correct [39]. Finally, the measured beam sizes are subject to jitter and systematic measurement errors like resolution limitations.
In order to take all of this into account, the training procedure was modified. Each data set is now created with a slightly different focusing strength scan range. Since the number of scan points is kept constant, the spacing between data points is now slightly different every time, which might also be the case in reality. Furthermore, the first and last focusing setting were added to the range of predicted parameters ($M_\mathrm{out} = 9$). Measurement errors were taken into account by generating noisy data sets from the ideal sets by adding normally distributed errors. Both relative and absolute errors on focusing strength and beam size were considered with magnitudes based on experience at ARES. From each data set, $N_\mathrm{err} = 100$ noisy sets with relative errors and $N_\mathrm{err}$ noisy sets with absolute errors were generated. In addition, the procedure was repeated, this time enforcing a 10 µm resolution limit on the beam sizes. Including the ideal data, the total number of data sets now increases to $N_\mathrm{tot} = 2(2N_\mathrm{err} + 1) \cdot N = 6458532$. To visualize the importance of training the FCNN with noisy data, we fed data sets with increasing artificial noise to both the network trained solely on ideal data and one trained on noisy data. The results are shown in Fig. 7. It can be seen that the population with less than 5 % error decreases significantly with noise level if the FCNN is not trained on noisy data.
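The augmentation procedure can be sketched as follows. The noise magnitudes are hypothetical placeholders, and the way the resolution limit is enforced (here a simple lower bound on the beam size) is an assumption, since the text does not specify it; the function names are likewise illustrative.

```python
import random


def n_total(n, n_err):
    """Ideal set plus n_err relative and n_err absolute noisy copies,
    each generated with and without the resolution limit."""
    return 2 * (2 * n_err + 1) * n


def add_noise(values, rel_sigma=0.0, abs_sigma=0.0, rng=random):
    """Add normally distributed relative and absolute errors to a list of values."""
    return [
        v * (1.0 + rng.gauss(0.0, rel_sigma)) + rng.gauss(0.0, abs_sigma)
        for v in values
    ]


def apply_resolution(sigmas, limit=10e-6):
    """Assumed model: beam sizes cannot be measured below the resolution limit."""
    return [max(s, limit) for s in sigmas]


rng = random.Random(1)
ideal = [5.0e-6, 20.0e-6, 150.0e-6]                # hypothetical rms beam sizes in m
noisy_rel = add_noise(ideal, rel_sigma=0.05, rng=rng)    # assumed 5 % relative jitter
noisy_abs = add_noise(ideal, abs_sigma=2.0e-6, rng=rng)  # assumed 2 um absolute error
limited = apply_resolution(ideal)
print(n_total(16066, 100))
```

The count follows directly from the description: each of the N ideal sets yields 2N_err + 1 variants, and the whole procedure is carried out twice, once with the resolution limit applied.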
We performed the same general analysis for the new network, as described above, and saw the same overall behaviour. The results are summarized in Table III. Compared to the network based on ideal data, the performance is slightly worse, but still for the majority of data sets the emittance is predicted with less than 5 % error.

VII. MEASUREMENTS AT THE ARES LINAC
As a real world test, we conducted emittance measurements using the phase advance scan technique at the ARES linac at DESY. The layout of the measurement setup is shown in Fig. 1. We took data for several bunch charges by adjusting an attenuator in the cathode laser beamline. The charge was measured both with a Faraday cup, which can be inserted into the beamline instead of the scintillating screen, and a cavity based charge monitor ∼ 0.7 m downstream of the screen [40]. All measurements shown here were performed according to the procedure introduced in Sec. II. Transverse beam sizes were determined from camera images of a scintillating Ce:GAGG (Cerium doped Gadolinium Aluminium Gallium Garnet) screen. The spatial resolution of the system is specified to be ∼ 10 µm [41]. Figure 8 shows the emittance values obtained from the measurements using both the FCNN and the traditional fit method. The data points are compared to an ASTRA simulation including space charge based on the machine settings on the day of the measurements, including uncertainty. It can be seen that the FCNN results are much closer to the expected values than the results obtained from the envelope equation fit. It is interesting to note that Fig. 8 reproduces the expected behaviour shown in Fig. 6, as the fit underestimates the emittance for charges < 0.5 pC and overestimates it for higher charges. The FCNN result follows the ASTRA curve much more closely, mostly staying within the uncertainty of the simulation. In order to cross-check the obtained results, we performed additional emittance measurements using a grid mask based method, as described in [14], using the same machine setup. Three different grids were used for the measurements: an SPI G200TH TEM grid [42], an SPI G300 TEM grid [43] and a custom made pepper pot [44]. Measurements were performed for two different charge settings, 1 pC and 2 pC.

[Table caption fragment: "This explains the generally better performance compared to Table III."]
These values were deliberately chosen to be in the strongly space charge dominated regime, where the traditional fit yields particularly bad results. At ARES, the grid masks are installed at the same z-position as the screen used to record the phase advance scan data. This means that the emittance obtained from the grid measurements will always be different from the phase advance scan result, as the phase advance scan yields the emittance at the position of the focusing element. The emittance is furthermore expected to be different, because in order to image the grid, the beam needs to be focused slightly before the grid, which can lead to emittance growth. Nevertheless, it is still possible to compare the measured values to ASTRA simulations, which would show that the ARES setup depicted in Fig. 1 can be well simulated with ASTRA. This would validate the FCNN results indirectly. Figure 9 shows an ASTRA simulation of the grid measurement using a 1 pC bunch with γ = 6.8. It can be seen that the emittance is strongly affected by focusing the beam down. The measurement results from all three grids, as well as the expected values from the ASTRA simulation are summarized in Table IV. The results are very close to the expected value in the 1 pC case. The measurement of the 2 pC beam shows a slightly higher than expected emittance value, which is in line with the high uncertainty of the high charge results shown in Fig. 8. We hence conclude that ASTRA simulates the ARES beamline shown in Fig. 1

A. Prediction of other parameters
As discussed above, the FCNN also predicts fixed machine parameters and other charge dependent beam parameters to varying accuracy (see Table III). Table V summarizes the predicted fixed machine parameters in comparison to what was used in the experiment. It can be seen that both the mean laser spot size and pulse length are predicted to be larger than the expected values. Since measuring the laser spot size on the cathode directly is very difficult in the ARES setup, the predicted values fall within the uncertainty. As discussed in Sec. V, the prediction of the laser pulse length should be treated with caution. The solenoid scan range is predicted well within the uncertainties. Figure 10 shows the prediction results for the beam charge, as well as the charge dependent beam parameters bunch length, beam size and beam divergence at the solenoid. As in Fig. 8, the data points are compared to an ASTRA simulation including space charge based on the machine settings on the day of the measurements, including uncertainty. It can be seen that the beam charge prediction fits the measured values well. Both beam size and beam divergence follow the ASTRA curve well, albeit at the lower end of the uncertainty, denoted by the shaded area. The bunch length follows a more linear charge dependence than expected from the ASTRA simulation, which might be attributed either to the not fully known temporal and spatial laser pulse shape at the cathode or to the overall prediction performance for parameters of the longitudinal phase space (see Sec. V).
In order to cross-validate the prediction results, an ASTRA simulation using the mean predicted laser spot size and pulse length (see Table V) was performed. The results are shown in Fig. 10 as the blue dashed line. Indeed, a larger spot size and longer pulse length lead to results closer to the lower end of the uncertainty in all three cases, which can be explained by the reduced charge density. Remaining discrepancies might be explained by the not fully known temporal and spatial laser pulse shape at the cathode.

VIII. CONCLUSION AND OUTLOOK
We have shown in simulation that a pre-trained fully connected neural network can be used to predict the transverse emittance from phase advance scan data even in the ρ ≫ 1 regime and in cases where a traditional envelope equation based fit is mathematically not feasible. We have optimized the network for real-world measurement data and achieved < 5 % error for the majority of the test data set population (89.2 %), resulting in a mean relative error of 2.5 %. We have applied our method to measurements conducted at the ARES linac at DESY and compared the predictions to numerical simulations using the well benchmarked code ASTRA, as well as to results obtained from the traditional fit method. As expected from the simulation study, the FCNN predictions are much closer to what is expected from the numerical simulation. We have furthermore cross-validated the results using additional emittance measurements with a grid mask based method.
In addition to the transverse emittance, the network also predicts other key beam and machine parameters to varying accuracy. While quantities directly tied to the transverse phase space are predicted as accurately as or better than the emittance, quantities tied to the longitudinal phase space, such as the bunch length, are predicted less accurately, as expected. It should be noted that in our study the gun setting is not a variable in the process of training the FCNN. This means that for each gun setting (gradient and phase) a separate FCNN has to be trained. Inclusion of these two parameters could be part of a future study. Furthermore, parameters that are difficult to access directly, such as the thermal emittance, could be added. In conclusion, we have demonstrated that pre-trained FCNNs can be a powerful tool for the analysis of previously difficult to interpret data sets.

IX. MODEL AVAILABILITY
The FCNN models are available from the corresponding author upon request in TensorFlow format.