Generation of highly mutually coherent hard x-ray pulse pairs with an amplitude-splitting delay line

Beam splitters and delay lines are among the key building blocks of modern-day optical laser technologies. Progress in x-ray free electron laser source development and applications over the past decade is calling for their counterparts operating in the Angstrom wavelength regime. Recent efforts in x-ray optics development have demonstrated relatively stable delay lines that most often adopted the division-of-wavefront approach for the beam splitting and recombination configuration. However, the two recombined beams have yet to achieve sufficient mutual coherence to enable applications such as interferometry, correlation spectroscopy, and nonlinear spectroscopy. We present the first experimental realization of the generation of highly mutually coherent pulse pairs using an amplitude-split delay line design based on transmission grating beam splitters and channel-cut crystal optic delay lines. The performance of the prototype system was analyzed in the context of x-ray coherent scattering and correlation spectroscopy, where we obtained nearly identical high-contrast speckle patterns from both branches. We show in addition the high level of dynamical stability during continuous delay scans, a capability essential for high sensitivity ultra-fast measurements.

X-ray photon correlation spectroscopy, for example, is an extension of dynamic light scattering, reaching the atomic length scale and femtosecond time scale. It has the potential to directly probe the femtosecond to picosecond (fs-ps) dynamics of disordered matter and its phase transitions that are currently inaccessible to any other existing experimental probe, e.g. many-body dynamics in super-cooled liquids, dynamical heterogeneity, and strong-to-fragile transitions [10,11]. Strong interest in these multi x-ray pulse capabilities has driven tremendous efforts in the design and implementation of hard x-ray split-delay optics at several x-ray FEL facilities over the past decade [12-20]. Existing designs and systems have established routine and stable delivery of hard x-ray pulse pairs with good efficiency in recent years [16,18]. The last and most demanding requirement that has yet to be met is the preservation of mutual coherence between the two pulses in a pulse pair. Two of the primary remaining limitations relate to the performance of the crystal-optics-based beam splitters [21], and the pulse front tilt induced by the asymmetric channel-cut crystals that were used to enhance the beam stability and to change the delay time [19].
Coauthors of this paper proposed a new optical design in 2020, which uses transmission gratings as the beam splitter and recombiner, and a dispersion-compensated all-channel-cut 8-bounce delay line to adjust the path length [22]. Numerical studies showed a significant performance enhancement. In this paper, we report the first experimental realization of this novel optical concept. This prototype device demonstrates the capability of generating nearly-identical pulse pairs, as manifested in the nearly-identical high contrast speckle patterns obtained from both branches, which is a direct proof of a high degree of mutual coherence. We show in addition the capability of maintaining this high mutual coherence during continuous delay scans, which is unprecedented and essential for high sensitivity ultra-fast measurements.

II. EXPERIMENT SETUP
The overall layout of the experiment setup is shown in figure 1 (a) and (b). The design and numerical analysis of this setup were described in Ref. [22]. The experiment was performed at the X-ray Pump Probe instrument using the diamond (111) monochromator, which selects a ∼0.5 eV bandwidth from the incident FEL output of a broader bandwidth [23,24]. Downstream of the monochromator, the size and transverse position of the incident beam were further defined by upstream slits. The slits were closed down to 500 µm to form a square aperture. The x-ray beam was first split by the upstream transmission grating G1. The grating was fabricated on a single crystal diamond substrate of 4 mm × 4 mm × 100 µm. The grating pattern was produced by high resolution electron beam lithography in hydrogen silsesquioxane resist. The pattern was then transferred to the diamond substrate by oxygen plasma assisted reactive ion etching [25,26]. The grating period is 500 nm and the overall size is 0.6 mm × 1.5 mm. A photograph and a high resolution scanning electron microscopy image of the diamond grating are shown in figure 1 (d) and (e), respectively. The groove depth of the grating is about 5 µm. During the experiment, the grating was rotated around the x axis to increase the effective groove depth towards 8 µm in order to reach a π phase shift that maximizes the photon flux in the ±1 diffraction orders. Diffracted beams from G1 were delivered to the main crystal-optics table (SD table) through an evacuated beam path. The distance between G1 and the first channel-cut crystal (CC1) on the SD table was 2.90 m.
The SD table, shown in figure 1(c), supports the motion control mechanisms for 6 silicon channel-cut crystals (CCs), which are referred to as CC1 to CC6 in this paper. CC1 and CC6 are regular channel-cut crystals with pairs of polished parallel optical surfaces, but different gap sizes: 25.15 mm and 25.8 mm respectively. CC2 to CC5 are asymmetric channel-cut crystals (ACCs) with asymmetry angles of 5°, as can be seen from figure 1. The dimensions of the ACCs are identical to those described in Ref. [19]. All 6 CCs utilized (220) Bragg reflections. Effectively, two delay lines were formed by the 6 Bragg reflection pairs: the fixed branch (CC1 and CC6) and the delayed branch (CC2 to CC5). In the delayed branch, CC2 and CC3 were mounted on a single air-bearing stage. The relative path length between the two branches can be adjusted by moving the air-bearing stage along the x direction as indicated in figure 1(b). At 9.83 keV, the spatial separation of the ±1 orders of diffraction is ∼1.5 mm at CC1.
This distance allows full spatial separation of the two diffraction orders. The +1 order of diffraction from G1 was picked up by CC1, while the −1 order of diffraction was picked up by CC2. The 0 order diffraction is filtered out by CC2, being outside its reflection bandwidth. The SD table was enclosed in a helium environment to reduce air absorption.
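The quoted ∼1.5 mm separation can be cross-checked with the grating equation. The following is a minimal sketch, not the authors' calculation: it assumes normal-incidence first-order diffraction, sin θ = λ/p, with the nominal 500 nm period (the tilt of G1 about the x axis is assumed not to change the period seen by the beam), and it uses the 2.90 m distance from G1 to CC1. The function names are illustrative only.

```python
import math

HC_EV_ANGSTROM = 12398.42  # h*c in eV*Angstrom


def wavelength_m(photon_energy_ev: float) -> float:
    """Photon wavelength in metres for a photon energy given in eV."""
    return HC_EV_ANGSTROM / photon_energy_ev * 1e-10


def first_order_angle_rad(photon_energy_ev: float, period_m: float) -> float:
    """First-order diffraction angle from the grating equation sin(theta) = lambda / p."""
    return math.asin(wavelength_m(photon_energy_ev) / period_m)


def order_separation_m(photon_energy_ev: float, period_m: float, distance_m: float) -> float:
    """Transverse distance between the +1 and -1 orders after propagating distance_m."""
    theta = first_order_angle_rad(photon_energy_ev, period_m)
    return 2.0 * distance_m * math.tan(theta)


if __name__ == "__main__":
    angle = first_order_angle_rad(9830.0, 500e-9)
    sep = order_separation_m(9830.0, 500e-9, 2.90)
    print(f"first-order angle: {angle * 1e6:.0f} urad")
    print(f"+1/-1 separation at CC1: {sep * 1e3:.2f} mm")
```

Under these assumptions the first-order angle is about 250 µrad and the separation comes out slightly below 1.5 mm, consistent with the value stated in the text.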
Downstream of the SD table, the two exit beam paths from CC5 and CC6 merged at the second grating G2, which has the same parameters as G1.
Further downstream of G2, only the −1 order of diffraction from the fixed branch and +1 order from the delayed branch became parallel to the incident beam. The grating-induced angular dispersion was also fully removed by 2 deflections of opposite directions.
A beryllium Compound Refractive Lens (CRL) was used to focus the beam down to ∼1 µm at the nominal sample plane. The slits upstream of the CRL were used to define the illuminated area of the lens to reduce the sensitivity of the focal position to upstream beam position changes. The slits downstream of the CRL were used to block the other diffraction orders from G2. At the nominal sample plane, we could insert either a silica nanoparticle powder sample to produce coherent small angle scattering, or a scintillator based beam profile monitor to directly investigate the spatial properties of the focused beam.
After another section of evacuated beam path, 5.40 m further downstream of the sample plane, an ePix100 x-ray area detector [27] was used to collect speckle patterns of the sample.
A beam stop was positioned in front of the detector to block the direct beam.

III. ALIGNMENT PROCEDURE
This section describes the alignment procedure of the system. First, the position and orientation of G1 were optimized by analyzing the forward diffraction with the beam profile monitor at the sample plane. By adjusting the orientation of G1, all diffraction orders were brought to the horizontal plane. G1 was next rotated around the x axis to maximize the flux in the ±1 diffraction orders. The optimized rotation angle was found to be 72° ± 2°. In the next step, optimal Bragg conditions of CC1 and CC6 were established using intensity diagnostics behind each crystal. This was repeated for the delayed branch from CC2 to CC5. We then brought the two exit beams from both branches to the same vertical position as the input beam by adjusting the tilting angles of the crystals. The two beams were then overlapped at the G2 location using another scintillator screen. In addition, an intentional small angular offset was added to the CC6 Bragg angle to match the slightly narrower bandwidth of the delayed branch. To align G2, we used the scintillator screen at the sample plane to maximize the ±1 orders from the fixed branch only. The optimized rotation angle was found to be 73° ± 2°. The spatial overlap between the two branches at the sample plane was finally established on the sample plane profile monitor, first with unfocused beams and then with focused beams.
After the two unfocused exit beams were spatially overlapped on the sample plane, a glassy carbon prism was inserted into the fixed branch to create a small crossing angle between the two branches. When the two pulses overlap in time to within the coherence time, high contrast interference fringes are observed. This allows us to determine time zero, T_0, of the split-delay system.
The steering angle was determined by the prism shape and orientation. In our case, this angle was ∼5 µrad. A detailed analysis is presented in Appendix C. A few examples of interference fringes observed near T_0 are shown in figure 2 (a). The relative delay time as a function of the air-bearing stage position [22,28] follows equation (1), where Δ(t) is the change of the delay time, Δ(d) is the displacement of the air-bearing stage, θ_Bragg is the Bragg angle, and α is the asymmetry angle of the ACCs, which is 5°. The relationship between the visibility of the interference fringes and the delay time allows us to determine T_0 and characterize the mutual longitudinal coherence between the two branches. The fringe visibility is calculated for a large number of single shot patterns at various delays between -40 fs and 40 fs and plotted in figure 2(b). The coherence time, i.e. the FWHM of this curve, is determined to be 11.3 fs. Since Δ(d) is measured to better than 1 µm (corresponding to a Δ(t) smaller than 1 fs), the uncertainty of the delay time is dominated by the width of this visibility peak. Therefore the accuracy of the delay time is about 10 fs. A distinct and asymmetric modulation of the overall bell-shaped curve was observed on the tails. This modulation is a signature of the temporal tail structure of the output pulses as predicted in Ref. [22]. The asymmetry can be attributed to the remaining bandwidth and spectral content differences between the two branches. We are able to reproduce the average contrast as a function of delay time by numerical modeling (the source code is available at the code base [29]). A prism steering angle of 5 µrad and a delayed branch detuning of 5.2 µrad were used in the simulation.
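As an illustration of the quantity plotted against delay, the sketch below computes the Michelson visibility, (I_max - I_min)/(I_max + I_min), of a one-dimensional fringe profile. This is only a simple, noise-sensitive estimator chosen for clarity; the analysis actually used for figure 2(b) is in the code base [29] and may differ. The synthetic profile and all function names are hypothetical.

```python
import math


def fringe_visibility(profile):
    """Michelson visibility (I_max - I_min) / (I_max + I_min) of a 1-D intensity profile."""
    i_max = max(profile)
    i_min = min(profile)
    if i_max + i_min <= 0:
        raise ValueError("profile must contain positive intensity")
    return (i_max - i_min) / (i_max + i_min)


def synthetic_fringe(visibility, period_px, n_px):
    """Ideal cosine fringe with unit mean intensity, used only to exercise the estimator."""
    return [1.0 + visibility * math.cos(2.0 * math.pi * x / period_px) for x in range(n_px)]


if __name__ == "__main__":
    profile = synthetic_fringe(visibility=0.6, period_px=20, n_px=200)
    print(f"recovered visibility: {fringe_visibility(profile):.3f}")
```

For real single-shot images, a fit or a Fourier-amplitude estimate at the fringe frequency would be less sensitive to photon noise than the raw maximum and minimum.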
The high contrast interference fringe pattern can be used for evaluating and optimizing the angular overlap between the two branches. Here, k_D represents the output wave vector of the delayed branch and k_F represents the output wave vector of the fixed branch. k is the incident wave vector, g_1 is the photon momentum transfer from G1 to the fixed branch, g_2 is the photon momentum transfer from G2 to the delayed branch, p is the photon momentum transfer from the prism to the fixed branch, and ∆+ is the photon momentum transfer from the asymmetric Bragg reflections. We assume that |g_1| = |g_2|. Therefore, when the grating vectors of G1 and G2 are not parallel to each other, the crossing angle between the two branches acquires a component out of the horizontal plane, and the interference fringes will be tilted with respect to the vertical axis.
By adjusting the orientation of G2, we could eliminate the tilt of the interference fringes and thus the vertical crossing angle.
Quantitatively measuring the fringe spacing also allows optimization of the crossing angle in the horizontal direction. Based on the prism steering calculation, we expected a 5.0 µrad crossing angle between the two branches if they were parallel prior to prism insertion (see the detailed calculation in Appendix C). The measured fringe period shown here indicated that the crossing angle of the two branches was 10 µrad. The excess horizontal crossing angle was a result of minor misalignment of the ACCs in the delayed branch within the Bragg angle bandwidth. For the speckle measurements presented later, the angles of the asymmetric channel-cut crystals were optimized such that, at T_0, with the prism removed, there were no noticeable interference fringes.
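The link between fringe spacing and crossing angle can be made explicit with the standard two-beam interference relation Λ = λ / (2 sin(θ/2)) ≈ λ/θ for small θ. The sketch below is an illustration under that textbook assumption (two plane waves of equal wavelength), evaluated at the 9.83 keV photon energy quoted for the setup; the function names are hypothetical and the measured fringe period itself is not reproduced here.

```python
import math

HC_EV_ANGSTROM = 12398.42  # h*c in eV*Angstrom


def wavelength_m(photon_energy_ev):
    """Photon wavelength in metres for a photon energy given in eV."""
    return HC_EV_ANGSTROM / photon_energy_ev * 1e-10


def fringe_period_m(photon_energy_ev, crossing_angle_rad):
    """Fringe period of two plane waves crossing at the given full angle."""
    return wavelength_m(photon_energy_ev) / (2.0 * math.sin(crossing_angle_rad / 2.0))


def crossing_angle_rad(photon_energy_ev, fringe_period):
    """Inverse relation: full crossing angle from a measured fringe period in metres."""
    return 2.0 * math.asin(wavelength_m(photon_energy_ev) / (2.0 * fringe_period))


if __name__ == "__main__":
    for angle in (5e-6, 10e-6):
        period = fringe_period_m(9830.0, angle)
        print(f"{angle * 1e6:.0f} urad -> fringe period {period * 1e6:.1f} um")
```

With these inputs, a 5 µrad crossing angle corresponds to a fringe period of roughly 25 µm and a 10 µrad crossing angle to roughly half of that, so a factor-of-two excess in crossing angle shows up directly as a halved fringe spacing.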
We note in addition that even though very distinct single shot interference patterns were observed at T_0, the interference fringes vanish upon multi-pulse averaging. This indicates the expected absence of phase stability between the two branches. According to equation (1), to achieve phase stability at 9.83 keV, the positioning jitter of the air-bearing stage would need to be much smaller than 0.22 nm, the crystal translation that corresponds to a π phase shift. This is far smaller than the actual 20 nm positional jitter of the air-bearing linear stage.
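The 0.22 nm figure can be checked for order of magnitude without the analytic delay expression. The sketch below is a rough consistency check, not the authors' calculation: it assumes that the delay change per unit stage displacement can be taken from the fly-scan figures quoted in the performance evaluation (0.3 mm/s of stage motion corresponding to 0.28 ps/s of delay), and that this calibration also applies at 9.83 keV. A π phase shift corresponds to a delay of half an optical period, λ/(2c).

```python
C_M_PER_S = 299_792_458.0  # speed of light
HC_EV_ANGSTROM = 12398.42  # h*c in eV*Angstrom


def delay_per_displacement_s_per_m(scan_speed_m_per_s=0.3e-3, delay_rate_s_per_s=0.28e-12):
    """Delay change per unit stage travel, assumed equal to the quoted fly-scan ratio."""
    return delay_rate_s_per_s / scan_speed_m_per_s


def pi_phase_displacement_m(photon_energy_ev, s_per_m):
    """Stage displacement that changes the relative delay by half an optical period."""
    wavelength = HC_EV_ANGSTROM / photon_energy_ev * 1e-10
    half_period_s = wavelength / (2.0 * C_M_PER_S)
    return half_period_s / s_per_m


if __name__ == "__main__":
    d_pi = pi_phase_displacement_m(9830.0, delay_per_displacement_s_per_m())
    print(f"displacement for a pi phase shift: {d_pi * 1e9:.2f} nm")
    print(f"20 nm jitter / pi displacement: {20e-9 / d_pi:.0f}")
```

Under these assumptions the π-phase displacement comes out between 0.2 nm and 0.25 nm, so a 20 nm stage jitter spans many phase cycles, which is consistent with fringes that are visible in single shots but wash out on averaging.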

IV. PERFORMANCE EVALUATION
In this section we present analysis of the measured photon throughput and the relative pointing stability between the two branches. In addition, we investigate the mutual coherence between the two foci in detail by comparing small angle speckle patterns generated from two branches.
We first measured the diffraction and transmission efficiency of G1 by imaging the diffracted beams. The comparison between the measured and theoretically optimal throughput is summarized in table I. While the measured throughput here is significantly lower than the 21% (9% and 12% respectively for the delayed and fixed branch) reported in [22], the reduction can be fully accounted for, within the measurement uncertainty, by considering the broader incoming beam bandwidth, the actual diamond grating performance parameters, and the absorption in air paths, x-ray windows, and x-ray diagnostics. This strongly supports the feasibility of approaching the theoretical performance by further improving the grating fabrication as well as eliminating air paths and windows. More details about the calculation can be found in Appendix B.
Next we discuss the relative stability of the two branches by analyzing the beam positions of the two foci measured in the sample/focal plane. The two output beams after G2 were focused by the CRL with a focal length of ∼1 m. The foci of the two relevant diffraction orders were intentionally separated using the glassy carbon prism between CC1 and CC6. On the other hand, the system showed a high degree of tolerance to upstream instabilities: the relative position jitter between the two pulses was much smaller than the position jitter of each individual branch. This can be attributed to the use of amplitude splitting. The common beam motion is more apparent in the long-term focus stability measurements shown in figure 4. However, the beam profile monitor cannot resolve the detailed transverse profile of the focused beams due to its limited spatial resolution (∼1 µm). Moreover, the beam profiles as well as the spectral content of the pulse pairs fluctuate from pulse to pulse following the input beam variations, potentially impacting the signal quality of an XPCS measurement. Therefore, we directly investigated the degree of transverse coherence by analyzing the small angle coherent scattering from a static silica powder sample [31]. This measurement was performed at 9.5 keV. In order to decouple the impact of the split-delay optics from upstream fluctuations and optics imperfections, we initially limited the input beam size with a 50 µm square aperture. The ePix100 detector recorded speckle patterns from either both or one of the two branches [32]. High contrast and notably similar average speckle patterns were observed in all 3 scenarios, as displayed in figure 5 (a-c). To quantify the similarity between the two pulses in the context of XPCS measurements, we evaluate the effective overlap µ, which is related to the visibility degradation in the absence of sample dynamics. It is defined in terms of the visibilities β and the intensity branching ratio of the two branches, r ≡ i_1/(i_1 + i_2).
The subscripts 1 and 2 of the intensity i and visibility β denote the delayed branch and the fixed branch respectively. The angle of CC6 was detuned to achieve an equal intensity splitting (r ≈ 0.5) between the two branches. Several delay points spanning the ∼10 ps time delay range of the system were selected for scattering measurements. For each delay point, a two-step visibility analysis was performed to obtain the visibility in the 3 conditions corresponding to r = 0, 0.5, 1. First, we used the droplet based 'greedy guess' algorithm to locate photon positions in each speckle pattern [33]. Then, from a large number of frames, a maximum likelihood estimator was applied to find the mode number that optimizes the likelihood of the negative binomial distribution from our photon statistics measurements, i.e., the probabilities of 1, 2, and 3 photons per pixel within the count rate range of 0.01-0.1 photons per pixel [34,35]. The calculated effective overlap is plotted in figure 5 (d) for the selected delay points.
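The second step of the visibility analysis can be sketched as follows. This is a minimal illustration, not the estimator of Refs. [34,35]: it assumes the standard negative binomial photon-count distribution for M speckle modes at mean count rate k̄, with contrast β = 1/M, a known k̄, a simple grid search over M, and a histogram that also includes the zero-photon bin. The histogram in the usage example is synthetic, and all function names are hypothetical.

```python
import math


def nb_log_pmf(k, kbar, modes):
    """Log-probability of k photons in a pixel for mean count kbar and M = modes."""
    return (math.lgamma(k + modes) - math.lgamma(modes) - math.lgamma(k + 1)
            + k * math.log(kbar / (kbar + modes))
            + modes * math.log(modes / (kbar + modes)))


def log_likelihood(hist, kbar, modes):
    """Log-likelihood of a photon-count histogram {k: number of pixels}."""
    return sum(n * nb_log_pmf(k, kbar, modes) for k, n in hist.items())


def fit_modes(hist, kbar, grid):
    """Grid-search maximum likelihood estimate of the mode number M."""
    return max(grid, key=lambda m: log_likelihood(hist, kbar, m))


def contrast_from_modes(modes):
    """Speckle contrast (visibility) beta = 1 / M."""
    return 1.0 / modes


if __name__ == "__main__":
    kbar, true_modes = 0.05, 2.0
    # Synthetic histogram built from the expected counts (no shot noise), for illustration.
    hist = {k: 1e8 * math.exp(nb_log_pmf(k, kbar, true_modes)) for k in range(8)}
    grid = [1.0 + 0.05 * i for i in range(181)]
    m_hat = fit_modes(hist, kbar, grid)
    print(f"fitted M = {m_hat:.2f}, contrast = {contrast_from_modes(m_hat):.2f}")
```

At count rates of 0.01-0.1 photons per pixel the information on M is carried mostly by the rare two- and three-photon events, which is why a large number of frames is needed. The contrasts obtained in this way for the three conditions are the inputs to the effective overlap.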
The effective overlap is consistently above 90%. We note that, before focusing, the two output beams had a 22 µm horizontal relative motion when translating CC2 and CC3 together over a 10 mm scan, potentially arising from a ∼0.02° asymmetry angle mismatch between CC2 and CC3. This led to a ∼200 nm relative horizontal motion between the two beams at focus (see Appendix F 2 for details). We compensated for this beam offset by translating CC5 at each time delay as an optimization procedure to achieve optimal spatial overlap. At each fixed time delay, the summed-speckle contrast changed by less than 3% during our half-hour measurement, as plotted in figure 5 (e). However, as shown in figure 5 (f), the single-branch contrast values showed non-negligible variations across different time delays measured over a period of several hours. This implies that, even though the two branches can maintain a sufficient level of relative stability, variations of the upstream beam condition can impact the transverse properties of the individual beams, e.g., an upstream beam trajectory or position drift may change the transverse portion of the beam that illuminates the slit and the lens.
This poses challenges in the data interpretation/normalization and can potentially lead to systematic errors in an actual XPCS measurement, in which intrinsic dynamics is also revealed through contrast changes. To overcome these types of drifts over the time scale of minutes and hours, a better scheme of measurement is to repeat time delays faster than the time scale of these drifts [19,36].
We thus performed a fly-scan test with a speed of 0.3 mm/s, or 0.28 ps/s. The contrast curves from the 3 conditions, extracted from scattering patterns grouped by delay time, are plotted in figure 6 (a). A different slit setting was used to mitigate the effects of relative beam offsets due to the gap mismatch between CC2 and CC3: the upstream slits were wide open, and the slits right upstream of the CRLs were closed down to 150 µm so as to always illuminate the same area on the lens. The contrast values in this configuration are noticeably lower than those presented in figure 5. This could be attributed to imperfections of upstream optics such as the known asymmetry angle in the beamline monochromator diamond crystal [24]. On the other hand, the individual branches are more similar across different delays, as we see significantly less contrast variation from each branch as a function of delay time. A bidirectionality was also observed, manifested in the small difference in contrast levels between the positive (delay increase) and negative (delay decrease) scan directions, likely due to cable tension. Nevertheless, as displayed in figure 6 (b), the overall effective overlap remains at a high level, showing negligible changes in µ within the first 4 ps.
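To make the role of µ concrete, the sketch below shows one way an effective overlap can be extracted from the three measured contrasts. It is an illustration under a stated assumption, not necessarily the exact definition used in this paper: the two pulses are taken to add incoherently on the detector, so that the summed contrast is β = r²β_1 + (1−r)²β_2 + 2r(1−r)µ√(β_1β_2), in which case µ is the correlation coefficient between the two single-branch speckle patterns. The numerical values in the example are hypothetical.

```python
import math


def summed_contrast(beta1, beta2, r, mu):
    """Contrast of the incoherent sum of two speckle patterns with overlap mu."""
    return (r ** 2 * beta1 + (1.0 - r) ** 2 * beta2
            + 2.0 * r * (1.0 - r) * mu * math.sqrt(beta1 * beta2))


def effective_overlap(beta_sum, beta1, beta2, r):
    """Invert the relation above for mu, given the summed and single-branch contrasts."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    cross_term = 2.0 * r * (1.0 - r) * math.sqrt(beta1 * beta2)
    return (beta_sum - r ** 2 * beta1 - (1.0 - r) ** 2 * beta2) / cross_term


if __name__ == "__main__":
    # Hypothetical contrasts, for illustration only.
    beta1, beta2, r = 0.40, 0.50, 0.5
    beta_sum = summed_contrast(beta1, beta2, r, mu=0.9)
    print(f"summed contrast: {beta_sum:.3f}")
    print(f"recovered overlap: {effective_overlap(beta_sum, beta1, beta2, r):.2f}")
```

In this model, identical speckle patterns (µ = 1) with equal contrasts and r = 0.5 give a summed contrast equal to the single-branch contrast, whereas fully uncorrelated patterns (µ = 0) halve it.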

V. CONCLUSION
In summary, we have experimentally implemented the new x-ray split-delay line using a grating-based amplitude-splitting all-channel-cut design. The system is able to generate femtosecond x-ray pulse pairs with significantly higher mutual coherence compared to previously realized systems. This is manifested in both the high contrast interference fringes and the high contrast two-pulse coherent small angle scattering patterns. We have also demonstrated the expected high relative stability between the two branches, which is well preserved in spite of the incoming x-ray beam pointing drift. The overlap stability during continuous delay scans, enabled by the channel-cut crystal pair translation with a single air-bearing stage, allows fast and accurate delay-time repetition, which is essential for robust high-sensitivity time domain measurements. Being able to maintain the overlap between micron-sized x-ray beams makes it possible to perform x-ray pump x-ray probe experiments with higher intensity x-ray excitation, e.g. enabling the generation and diagnosis of warm dense matter. Although the system covers only a relatively small time window of ∼10 picoseconds, systematic exploration of ultrafast dynamics in disordered matter on the picosecond scale with sub-100 fs time resolution through speckle visibility spectroscopy also becomes feasible.
We also note that the amplitude-splitting concept can be adapted to most existing split-delay optical systems by introducing grating beam splitters upstream and downstream of the delay lines. While one would anticipate a reduction of the available flux at the sample, this is more than compensated for by the improvement in mutual coherence, which will significantly increase the signal-to-noise and signal-to-background ratios in most cases.
The all-channel-cut system can also, in principle, be expanded to cover larger delay time ranges by adopting artificial channel-cut crystals with longer reflecting surfaces. This will ease the crystal manufacturing requirements and potentially yield higher surface quality as well as more accurate crystal asymmetry angle control. With the introduction of moderate cooling to the crystals, we anticipate this as a viable path towards deploying the split-pulse XPCS methodology as a robust way for studying disorder and fluctuations at the atomic scale at the upcoming high repetition rate sources.

ACKNOWLEDGMENTS
Use of the LCLS at SLAC National Accelerator Laboratory is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515.

Appendix A: Grating Diffraction Efficiency
The detailed calculation can be found in the code base [29].
The "Substrate" refers to the effective substrate thickness including the tilting angle of 72°. The "Air" refers to the air gaps outside the helium cover and the effective air gap inside the helium cover. The total path length inside the helium cover is about 120 cm. Assuming a helium fraction of 90%, the estimated effective air path inside the helium cover is 12 cm. The "Kapton D" refers to Kapton films for the delayed branch while "Kapton DF" refers to the fixed branch. "Scintillator" refers to the two diamond scintillators upstream and downstream of the SD table for beam profile monitoring; each has a thickness of 30 µm. η is the linear attenuation coefficient (the inverse of the absorption length) and l is the length of the material.
The transmission is calculated with the formula exp (−ηl).
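The bookkeeping behind this formula can be sketched as below. This is a minimal illustration: η is treated as an attenuation coefficient in inverse length so that ηl is dimensionless, the effective air path follows from the stated 120 cm helium path at 90% helium, and the attenuation coefficients in the usage example are hypothetical placeholders rather than values from the table. The actual calculation is in the code base [29].

```python
import math


def effective_air_path_cm(total_path_cm, helium_fraction):
    """Equivalent air path inside a helium enclosure that still contains some air."""
    return total_path_cm * (1.0 - helium_fraction)


def transmission(eta_per_cm, length_cm):
    """Transmission exp(-eta * l) of a single absorber."""
    return math.exp(-eta_per_cm * length_cm)


def total_transmission(components):
    """Product of the transmissions of a list of (eta_per_cm, length_cm) absorbers."""
    result = 1.0
    for eta_per_cm, length_cm in components:
        result *= transmission(eta_per_cm, length_cm)
    return result


if __name__ == "__main__":
    air_cm = effective_air_path_cm(120.0, 0.90)
    print(f"effective air path inside the helium cover: {air_cm:.0f} cm")
    # Hypothetical attenuation coefficients (1/cm), for illustration only.
    beam_path = [(0.004, air_cm), (3.0, 0.003), (3.0, 0.003)]
    print(f"combined transmission: {total_transmission(beam_path):.3f}")
```

Because the individual transmissions multiply, the exponents add, so thin but strongly absorbing elements and long weakly absorbing paths can contribute comparably to the total loss.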
Note that the uncertainties of the lengths given in the table are estimates rather than actual measurements. The corresponding uncertainty in the efficiency is assumed to be half of the change of the efficiency over the uncertainty range of the length. With this table, the theoretical energy efficiency is calculated for the delayed and fixed branches separately, and the theoretical prediction of the total energy efficiency is their sum, 0.76% + 1.08% = 1.8%.
The uncertainty of the total energy efficiency is obtained with the standard formula for uncertainty propagation.

Measurement
Right after the second grating, the directly measured energy efficiency is 5.99(2)%. The 0.02% error is statistical. The relative systematic error here can be up to 10%. The energy efficiency is therefore estimated to be 6.0(6)%. This 6.0(6)% throughput was obtained when the slit after the second grating was fully open. We assume that the angle of the second grating is 72(2)° and that the ±1 orders of diffraction have a transmission of 20(2)%. The total energy transmission efficiency parallel to the incident pulse and available for XPCS measurement is then estimated to be 1.8(3)%. The details of the calculation can be found in the code base [29].
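The step from 5.99(2)% to 6.0(6)% can be reproduced with a short sketch. It assumes that the statistical error and the 10% relative systematic error are independent and are combined in quadrature; this is the usual convention, but the code base [29] should be consulted for the procedure actually used. The function names are illustrative.

```python
import math


def combine_in_quadrature(*sigmas):
    """Root-sum-square combination of independent uncertainties."""
    return math.sqrt(sum(s * s for s in sigmas))


def total_uncertainty(value, statistical, relative_systematic):
    """Total uncertainty from an absolute statistical and a relative systematic error."""
    return combine_in_quadrature(statistical, relative_systematic * value)


if __name__ == "__main__":
    value_percent = 5.99
    sigma = total_uncertainty(value_percent, statistical=0.02, relative_systematic=0.10)
    print(f"efficiency: {value_percent:.1f} +/- {sigma:.1f} %")
```

The systematic term dominates completely here, which is why the quoted uncertainty is simply one tenth of the measured value.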

Energy Efficiency of Channel-cuts
It is challenging to directly measure the energy efficiency of the channel-cut crystals alone in the current setting. Therefore, the channel-cut efficiency shown in the body text is an estimate based on a model, which is explained below. In the experiment, the energy measured right after the SD table was compared with the incident pulse energy. From this comparison, the energy efficiency of the channel-cut crystals is calculated to be 42(8)%.
The energy transmission efficiency of all other components is accounted for separately in this estimate. The detailed calculation of the uncertainty can be found in the code base [29].

Previously, we assumed that the asymmetry angles are exactly 5.00° for all asymmetric channel-cut crystals. Here we show the impact of a small mismatch between the asymmetry angles within a CC pair. Assuming a 0.02° mismatch between CC2 and CC3, the 10 mm delay scan induces a 22 µm horizontal change of the relative position between the two pulses. If CC2 also has a 0.01° misalignment around its long axis, out of the diffraction plane, then the 10 mm delay scan will lead to a 22 µm horizontal change and a 744 nm vertical change of the relative position. This can explain the cyclic relative motion during our delay scan. In a ∼10 ps delay scan, we observed a cyclic horizontal motion of the unfocused delayed branch beam. With the focusing optics, the position error is demagnified to a ∼200 nm peak-to-peak beam wobble, in agreement with our measurement shown in figure 9.

Grating Misalignment
A misalignment of the orientation of the first grating has very limited influence on the properties of the setup. Assuming that G1 is misaligned by 1° in the x-y plane, the position of the delayed branch focus will change by 27 nm horizontally and 12 nm vertically during a full-range delay scan. Therefore, for the alignment of G1, adjustment based on unfocused beams on scintillator screens should be sufficient.