Learning and avoiding disorder in multimode fibers

Multimode optical fibers (MMFs) have gained renewed interest in the past decade, emerging as a way to boost optical communication data-rates in the context of an expected saturation of current single-mode fiber-based networks. They are also attractive for endoscopic applications, offering the possibility to achieve a similar information content as multicore fibers, but with a much smaller footprint, thus reducing the invasiveness of endoscopic procedures. However, these advances are hindered by the unavoidable presence of disorder that affects the propagation of light in MMFs and limits their practical applications. We introduce here a general framework to study and avoid the effect of disorder. We experimentally find an almost complete set of optical channels that are resilient to disorder induced by strong deformations. These deformation principle modes are obtained by only exploiting measurements for weak perturbations. We explain this effect by demonstrating that, even for a high level of disorder, the propagation of light in MMFs can be characterized by just a few key properties. These results are made possible thanks to a precise and fast estimation of the modal transmission matrix of the fiber which relies on a model-based optimization using deep learning frameworks.


Abstract
Multimode optical fibers (MMFs) have gained renewed interest in the past decade, emerging as a way to boost optical communication data-rates in the context of an expected saturation of current single-mode fiber-based networks. They are also attractive for endoscopic applications, offering the possibility to achieve a similar information content as multicore fibers, but with a much smaller footprint, thus reducing the invasiveness of endoscopic procedures. However, these advances are hindered by the unavoidable presence of disorder that affects the propagation of light in MMFs and limits their practical applications. We introduce here a general framework to study and avoid the effect of disorder in wave-based systems, and demonstrate its application for mutimode fibers. We experimentally find an almost complete set of optical channels that are resilient to disorder induced by strong deformations. These deformation principal modes are obtained by only exploiting measurements for weak perturbations harnessing the generalized Wigner-Smith operator.
We explain this effect by demonstrating that, even for a high level of disorder, the propagation of light in MMFs can be characterized by just a few key properties. These results are made possible thanks to a precise and fast estimation of the modal transmission matrix of the fiber which relies on a model-based optimization using deep learning frameworks.

I. INTRODUCTION
The description of light transport in Multimode Fibers (MMFs) has been widely studied since the '70s, with a complete analytical understanding available in the case of an ideal straight fiber [1]. However, imperfections of the fabrication, geometrical deformations, or changes of the environmental conditions introduce randomness that drastically modifies their transmission properties. When light injected in one mode statistically explores all the other modes with the same probability, i.e. in the strong coupling regime, some average properties can be predicted [2]. However, from a few centimeters to a few kilometers, typical MMF systems are neither in the no coupling nor in the strong coupling regime; disorder strongly influences light propagation but some aspects of the ordered behavior survive [3][4][5]. This intermediate regime has been little investigated so far due to the difficulty to experimentally characterize the effect of disorder on the modal content of the fibers.
Understanding the transition between these two regimes remains an important challenge for optical telecommunications, endoscopic imaging, and micromanipulation applications.
It is well known that injecting coherent light into an MMF results in the observation of a random pattern of bright and dark spots at the output, called speckle pattern. However, unlike scattering media, the observation of a speckle is not in itself a signature of disorder.
Indeed, perfect straight fibers also exhibit this property due to the existence of intermodal dispersion [6]. As long as multiple modes are excited, they quickly accumulate seemingly random relative phases leading to such complex interference patterns. In the past decade, the measurement of the Transmission Matrix (TM) emerged as a tool of choice to characterize and control the propagation of light in complex but deterministic optical linear systems.
Initially introduced in the context of scattering media [7][8][9], it consists in measuring the fieldfield linear relation between an input plane and an output plane. This concept was then applied to MMFs, unlocking new applications for endoscopic imaging [10][11][12], micromanipulation [13], quantum information processing [14], and for the control of optical channels for telecommunications [15]. It allowed in particular to demonstrate the robustness of the propagating modes in the case of short step-index fibers [6] and bent graded-index fibers [16].
However, the observation of the TM does not directly allow assessing the level of disorder, as dispersion and mode interference lead to the observation of a seemingly random matrix, even without disorder. Only when represented in the basis of the propagating modes does the TM allow us to fully capture the spatial propagation properties of the MMF. This can be done by directly injecting light and measuring the field in the mode basis [17] or by numerically projecting a TM measured in a basis of diffraction-limited spots [6]. In both cases, a good characterization is achieved only for the low order modes, as going into higher-order modes places increasingly demanding requirements on the beam quality and on the alignment [18].
A numerical post-treatment was demonstrated [6] to correct the TM measurement, but it still requires a careful and time-consuming procedure. Moreover, such an approach assumes that there is little to no disorder in the fiber, which forbids the study of the transition from the weak to the strong mode coupling regimes. One of the main challenges of practical applications of multimode fibers is not only to understand the effect of disorder, but to avoid it altogether. In this context, the time-delay operator introduced in quantum mechanics by E. Wigner and F. Smith [19,20] has recently attracted renewed interest among the complex media community. For a lossless optical system characterized by its scattering matrix S, which links all input channels to all output ones, the Wigner-Smith operator is constructed using the frequency derivative of S, and defined as Q = −iS −1 ∂ ω S. Interestingly, the eigenstates of this operator, also called principal modes, are insensitive to small variations of the frequency. The possibility to use wavefront shaping techniques to generate those input states opened new applications to improve some properties of light transport, such as to generate particle-like wave packets in chaotic cavities [21] and in scattering media [22,23], or to optimize energy storage in scattering media [24].
For MMFs, the scattering matrix can be approximated by the TM, whose measurement gives access to the principal modes. In the context of telecommunications, they are particularly attractive as they do not suffer from modal dispersion to the first order [25]. Their ability to be stable over a large bandwidth was observed in the case of weak [15] and strong disorder [26]. The possibility to find channels invariant to small modifications can be extended to other parameters than the frequency using the Generalized Wigner-Smith (GWS) operator [27,28]. These studies focused on the interaction between waves and localized targets in scattering environments in the microwave regime.
In the present paper, we first introduce a new approach that relaxes most of the experimental constraints on the procedure to measure quickly and accurately the TM of an MMF in the mode basis. It uses numerical tools based on a modern machine learning framework.
We demonstrate the ability to use the knowledge of the TM for small deformations to find an almost complete set of channels using the GWS operator, the deformation principal modes, that are insensitive to strong perturbations. To understand this effect, we show that, all across the deformation range, the evolution of the TM can be characterized by only a few parameters that account for the mode coupling between close-by propagating modes.

II. MISALIGNMENT AND ABERRATION ROBUST CALIBRATION
We first define the TM H pix measured in the pixel basis of a modulator and a camera, respectively located in planes conjugated with the input and output facets of a fiber. Leveraging a fast digital micro-mirror modulator and InGaAs camera, we estimate H pix with a 1 kHz frame rate in about 10 seconds. The principle of the experiment is presented in Fig. 1a and detailed in Appendix A. Ideally, the mode basis representation of the TM can simply be recovered using: where M i (resp. M o ) represents the change of basis matrix between the input (resp. output) pixel basis and the mode basis of the fiber (see Appendix D for the theoretical mode calculation). We use the orbital angular momentum modes basis, in which the modes are defined by a pair of indices l and m, characterizing respectively to the radial oscillations and the orbital angular momentum. However, in the presence of slight aberrations or misalignments, the change of basis matrices cannot be inferred only from the calculation of the theoretical mode profiles of the fiber. It leads to strong unwanted distortions of the mode basis TM, even with a carefully tuned setup [6]. Such an effect also occurs when working directly in the mode basis, as injected and detected modes, affected by the aberrations and misalignments, are different from the true ones. To overcome this problem, we designed a numerical procedure based on the neural network framework PyTorch [29], taking advantage of Graphics Processing Units (GPUs) for optimized computational times. Unlike neural networks, that consists of generalist layers, typically dense or convolutional layers, we use a model-based approach. Our network is composed of custom layers, that each mimics the effect of an aberration by applying a Zernike phase polynomial to the change of basis matrix.
The structure of the network is represented in Fig. 1d (Fig. 1c), about 94% of the energy is conserved. Moreover, the matrix shows a strong diagonal, which traduces a weak mode-coupling effect. 92% of the energy is in the block diagonal, representing the groups of degenerate modes (in green in Fig. 1d). It is important to stress that the optimization process only maximizes the total energy in the mode basis, the observed strong diagonal appears naturally. The procedure leads to accurate corrections regardless of the level of disorder (see Supplemental Material S5 for reconstruction comparison for different levels of disorder).

III. PERTURBATION INSENSITIVE CHANNELS
To learn how to be insensitive to disorder, we first characterize the full mode basis TM of an MMF when we introduce and gradually increase a perturbation. We apply a controlled deformation on the fiber along an axis orthogonal to the propagation direction (Fig. 2).
Deformations of the fiber core leads to mode dependent losses, back-reflection, and mode coupling that hinders telecommunication applications. Qualitatively, strong deformations have the effect of progressively populating the off-diagonal elements of the TM while reducing the energy on the diagonal. Thanks to the precise modal projection, we could observe for the first time the crossover from a nearly diagonal TM (weak coupling) [6] to a seemingly random TM (strong coupling) [26].
We first compute the correction for the TM of the unperturbed fiber using our aberration compensation approach. We then apply the same correction parameters for the measure- that are little affected by them. In the present work, the parameter of interest is the induced displacement ∆x, we then study the GWS operator defined as: we represent a sketch of the deformation procedure.
The second term appears due to the fact that H modes is not unitary [30]. We estimate the GWS operator for a small deformation ∆x = 14 µm. The derivative is numerically estimated using the approximation: We chose δx = 8 µm to mitigate the effect of noise that appears for smaller differences δx.
Its eigenmodes, referred to as the deformation principal modes, are theoretically insensitive to the deformation parameter ∆x to the first order. We numerically compute their input profiles, and compare the output intensity patterns across the full range of deformations.
The stability of the deformation principal modes is shown in Fig. 3 as a function of ∆x.
For comparison, we also test the injection of the fundamental mode, which is less affected by disorder than the other modes [31], and random wavefronts (the correlation is averaged over 20 realizations). The best deformation principal modes keep a correlation above 95% over the whole range of deformations compared to the output profile for no deformation.
Moreover, all the principal modes but 4 perform better than the fundamental mode. It is important to note that, while the GWS operator is only estimated for small deformations,

IV. DISCUSSION
To further investigate how the deformation principal modes, computed from the TMs for small deformations, can efficiently cancel the effect of large deformations, we study the deformation matrix defined as: This matrix quantifies how H modes (∆x) deviates from H modes (∆x = 0). It is equal to 0 if the TM remains unchanged. We want to determine the main characteristics that best describe how the TM is modified when the perturbation is applied. We then define a deformation operatorD that links each value of the deformation ∆x j to the corresponding deformation matrix D j . It characterizes the full evolution of the deformation of the matrix over the range of deformations applied. We compute the singular value decomposition of the operatorD, it amounts to performing a principal component analysis. The deformation plays the same role as the different realizations in standard principal component analysis implementations (see Appendix C). We show in Fig. 4a that the first two singular values amount to more than 96% of the total energy of the operator. The corresponding singular components are represented in Fig. 4b for one pair of input and output polarizations. As the deformation operator is computed for the whole range of deformations, the first principal components characterize the most important modifications applied to the TM during the deformation. It has been shown that, for low perturbations introduced by thermal fluctuations, the distortion of the TM can be parametrized by only one parameter [32]. To test here if the TM of a fiber under strong deformations can be parametrized by only a few parameters, we approximate the transmission matrix using just the first two components where α j are β j are directly extracted from the singular value decomposition (see Appendix C). We show in Fig. 4c the fidelity between the estimated matrixĤ modes (∆x j ) = H modes (∆x = 0). D j + I and the measured one. Surprisingly, all across the range of deformations, the TM can be estimated using only two parameters with a fidelity above 93%.
We can give a qualitative interpretation of the two significant components. U 1 is close to identity, traducing the loss of energy in the diagonal compared to the reference TM at ∆x = 0. It is equivalent to the decay of the ballistic light in the presence of a scattering environment in free space. The second vector U 2 shows a well defined symmetric pattern that corresponds to an energy conversion between modes with close-by radial and angular momenta l and m (see Fig 4b, d). This is consistent with the previous observations of mode coupling [33] in bent graded index fiber. It corresponds to photons being injected in one mode and leaving the fiber in a close-by mode, which corresponds to photons whose direction has been modified once by the perturbation. This phenomenon is analogous to the conversion between ballistic and single scattered photons in scattering media. This physical interpretation is made possible thanks to the precise correction of the aberrations that would otherwise destroy the symmetries of H modes .
The fact that the TM can be estimated precisely using only two terms, only one of them accounting for mode coupling, is counter-intuitive considering the fact that H modes shows a seemingly random aspect for high order modes at large deformations (see Fig. 2c). Coupling between modes further away in the l and m space can occur, it is the equivalent of multiple scattered photons in scattering media. However, strong mode coupling also comes with important losses due to coupling to non-guided modes that leak out of the fiber [31], leading a radial order l difference equal or lower than 1 and an orbital momentum m difference equal or lower than 1. c, Fidelity between the measured TM in the mode basis and the approximated one using Eq. 5 (blue). The fidelity is defined similarly as in the caption of Fig. 2b.

V. CONCLUSION
To summarize, we present a framework to study the effect of disorder in MMFs, allowing, in a matter of seconds, to fully characterize light propagation in the mode basis. As precise predictions for the effect of perturbations on multimode fibers in real-life situations is currently lacking, our approach provides a way to quantify such perturbations using measured transmission matrices and could serve as a benchmark for the study theoretical models. We harness this approach to observe for the first time the existence of deformation principal modes, that are robust against strong deformations, and that can be found by only using the knowledge of the fiber properties for small deformations. This can be explained by the predominance in the transmission properties of the coupling between nearby modes, even for large deformations. We emphasize that our framework is general and can be used to study any linear propagation system regardless of the presence or the type of perturbations.
Moreover, as the complexity of handling the effect of the aberrations and misalignments in the TM estimation is rejected onto a fast automatic post-processing, our approach is virtually robust to any optical system imperfections, allowing plug-and-play operations suitable for real-life applications. The modulation of the input field is achieved using the Lee hologram method [36]. It allows performing complex amplitude modulation using a binary amplitude DMD [37]. The We can then create three complex amplitude states; 0, 1 and −1. Sequences of patterns are generated and sent to the control board of the DMD where they are stored in the on-board memory. The sequence is then displayed at a 1kHz frame-rate on the DMD, which triggers the acquisition of the frames on the camera.
A tutorial on the Lee holograms is made available on our website: [38]. This modulation procedure has been implemented in the Python module SLMlayout [39] and the interface control of the DMD was done using the Python module ALP4lib [40]. We developed, share, and maintain both packages.
The complex output field is measured using an off-axis holographic technique [35]. We share a tutorial on the off-axis holography and some sample codes on our website: [38]. The complex field is simultaneously measured for the two orthogonal circular polarization states.
Let's call X (resp. Y) the matrix that represents the stack of vectors X i (resp. Y i ).
Eq. B1 can be rewritten: An estimationĤ pix of the TM can be found by using each vector of the the canonical basis for the input excitation patterns, i.e. using X = I. It gives direct access to the TM usingĤ pix = Y. One can also use any orthogonal basis, such as the Hadamard basis that is convenient for phase-only modulation [7], so thatĤ pix = Y.X −1 . However, in the presence of noise, or if one or more measurements fail, the quality of the reconstructed matrix is significantly altered. To mitigate those effects, we chose to use a set of random vectors X i with N masks > N in pix . We can then estimate the TM using: where . + represents the Moore-Penrose pseudo-inverse. We chose X i to be random patterns where the modulation on each pixel can take the value 0, −1 or 1. For each pattern, the percentage of off pixels, i.e. taking the value 0, is drawn from a uniform distribution between 60% and 80%. The percentage of on pixels taking the value 1 is drawn from a uniform distribution between 40% and 60%, the other pixels taking the value −1. The positions of the pixels are random. We chose N masks = 6 × N in pix = 7350 to ensure the existence and the stability of the pseudo-inverse of X.
By changing the input polarization state, we measure separately the two corresponding sub-matrices. They are finally combined into a large matrix of size 2N out pix × 2N in pix .

APPENDIX C: SINGULAR VALUE DECOMPOSITION OF THE DEFORMA-TION OPERATOR
We first reshape the stack of the matrices D j as a 2-dimensional matrixD of size Λ is a diagonal matrix of size N ∆x ×N ∆x containing the singular values, whose distribution is represented in Fig. 4a. U is a matrix containing the corresponding output singular vectors . They can be reshaped as 2-dimensional matrices U i of size N modes ×N modes .
The coefficient α j and β j in Eq. 5 are then expressed by: We represent in Fig. 6a the evolution of the absolute value of the coefficients of α and β as a function of the deformation. The contribution of α is dominant for small deformations and globally decreases as the deformation increases. Conversely, the contribution of β is small for small deformation and globally increases with the deformation. This trend confirms that the first effect to appear is the loss of energy on the diagonal, due to the effect of U 1 , and then the coupling to neighboring modes in the momentum space, due to the effect of U 2 . We observe that this global trend is modulated by a periodic oscillation. The beating between the two contributions can be attributed to the fact that the two physical effects are not fully decoupled in the two operators as U 2 also has significant energy on the diagonal, that modifies the energy of the ballistic photon similarly to U 1 . It has been shown that, in addition to mode coupling, deformations are associated with a global rescaling of the fiber which induces phase shifts that dominate for small deformations [6,16,32]. We represent in Fig. 6b the evolution of the phase on the diagonal of the modes basis TM.
We observe an oscillation with the same periodicity as the beating between α and β. As the perturbation increases, one expects higher-order coupling effects to become significant in Eq 5, which would be associated with the coupling between modes further away in the momentum space. However, such an effect increases the chance for the photons to couple to non-guided modes, leading to losses [31]. We show in Fig. 6c the variation of the energy of the mode basis TM as a function of the deformation. Losses increase with the deformation up to approximately 2.5%, confirming that higher-order coupling effects are still weak in this regime. We restrict ourselves in this study to deformations in the elastic regime of the material, higher deformations leading to non-reversible perturbations and permanent damage of the fiber.

APPENDIX D: CALCULATION OF THE THEORETICAL MODES
The starting point of the mode projection operation is to consider the ideal modes of the fiber. We want to estimate the modes profiles of a perfect straight graded-index fiber under the scalar approximation. Graded-index fiber mode profiles and dispersion relation do not have a closed-form analytical expression. However, approximate analytical expressions can be found, for instance, using perturbation theory or a variational approach [41].
Arguably the most widely used approximation is the Wentzel-Kramers-Brillouin (WKB) approximation. It leads to an analytical dispersion relation when assuming an infinite quadratic spatial profile of the refractive index. While leading to accurate estimations of the propagation constants, it has a limited accuracy for the expression of the spatial mode profiles [42,43], especially for low radial numbers l. Finite difference methods are easy to implement numerically, but the 2D discretization of the field leads to high memory requirement and computation time, and could lead to inaccuracies for high order modes. Because we consider axiosymmetric index n(r) profiles, we want to simplify the system to solve a 1D problem that only depends on the radial coordinate r, allowing us to increase the accuracy and decrease the computation time.
The 2D scalar Helmholtz equation for a propagating mode can be written in the cylindrical coordinate system as where ψ is the optical field, φ is the azimuthal coordinate, β is the propagation constant, and k 0 = 2π/λ with λ the wavelength.
Because the refractive index only depends on the radial coordinate r for a perfect gradedindex fiber, we can separate the variables r and φ. We are then looking for the orbital angular momentum modes of the form: with l the radial order and m the azimuthal order, which also corresponds to the orbital angular momentum. Injecting this expression in equation D1 leads to the 1D equation The singularity at r = 0 arising from the 1 r term makes direct finite difference methods unstable. We can use the transformation: and rewrite equation D3 as a quadratic Ricatti equation [44]: d r g l (r) + P (r) + Q(r)g l (r) + g 2 l (r) = 0, where A finite difference approximation of such equation leads to the recursive expression [44,45]: where g n l = g l (r n ), Q n = Q(r n ), P n = P (r n ), and h = r n+1 − r n is the step size.
The expression D4 can then be discretized as: To find the first steps to initialize the iteration, we need to consider the boundary conditions at the center of the fiber core: This procedure has been implemented in the Python module pyMMF [46] that we developed and share. Sample codes to compute the ideal modes of the MMF are available at the dedicated repository [34].

APPENDIX E: MODEL-BASED OPTIMIZATION FOR THE COMPENSATION OF THE ABERRATIONS AND MISALIGNEMENTS
Recent attempts were made to tackle the problem of modal decomposition using deep learning frameworks. As they used model-free neural network models, using standard convolutional [47] or dense layers [48], these systems require large training sets and significant amounts of memory. Moreover, computational times and limited accuracy forbid their use for more than 10 modes (the training for 10 modes took about 43 hours with a > 300,000 image training set in [48]). We developed here a model-based approach that only learns a few relevant parameters, is fast (a few seconds) to converge, and only requires one TM measurement. The general principle is to apply to the change of basis matrices M i and In order to implement our model, we first need to use complex-valued matrix operations.
However, complex tensors are not natively supported by the PyTorch framework we use. To do so, we add a dimension to our data structure of size 2 to encode the real part and the imaginary part of the complex values. We then create a set of elementary operations: complex conjugation, element-wise, and matrix multiplications. The key parts of our approach are the layers that mimic the effect of aberrations represented by Zernike polynomials. The input of each layer is a batch of complex 2D images of size N modes × N pix × N pix × 2. The effect of a layer Z k , corresponding to the k-th Zernike polynomial, is to add, to each 2D image, a phase contribution. It amounts to transforming each input image K ij , into a modified one K ij using: where F k (r, φ) is the k-th Zernike polynomial, r ij and φ ij are the polar coordinates corresponding to the pixel indexed by i and j, and α k is the weight of the aberration. α k is the only trainable parameter of the layer. The layer automatically calculates and stores the derivative of the output tensor with respect to this parameter, as required for the training process (backpropagation). By adding multiple Zernike layers, we simulate the effect of a high level of aberration. We perform a Fourier transform in the spatial dimensions and add other Zernike layers to simulate aberrations in the Fourier plane (see Fig.1d of the main text). The first Zernike polynomials correspond to phase slopes in the x and y directions and to a parabolic phase. When applied in the Fourier plane, they introduce spatial shifts in the x and y directions and a defocus. It allows compensating for misalignments in the x, y and z directions. Finally, we add a transformation T that applies a global scaling transformation in the spatial dimensions. The scaling parameter is the only trainable parameter of this layer.
We treat separately each combination of input and output polarizations. For each optimization, we train simultaneously two models, one for the input and one for the output change of basis matrix. The input data corresponds to the matrices M i and M o of respective size N modes ×N in pix and N modes ×N out pix that we compute using the approach detailed in the previous section. We convert and reshape them as PyTorch tensors of sizes N modes ×N in x ×N in y ×2 and N modes × N out As explained in the main text, we know that an ideal compensation of the aberrations corresponds to maximizing H modes , with . representing the L 2 norm (Frobenius norm) of a matrix. We choose as the cost function to minimize: where H pix is the experimentally measured pixel basis TM.
Finally, we run an optimizer based on a stochastic gradient descent approach (Adam optimizer [49]) to find the set of parameters ( the direct plane, 14 in the Fourier plane) for different values of the deformation. The coefficients are similar except for some variations, mostly for the astigmatism aberrations in both planes. We attribute these fluctuations to the different degrees of freedom not being totally independent, hence different combinations could lead to the same transformation of the modes. To confirm that the resulting transformations are indeed similar, we compute the corrected mode basis TM for ∆x = 16 µm using the correction obtained for ∆x = 16 µm (Fig. S6d) and using the correction obtained for ∆x = 0 µm (Fig. S6e). The error between the TMs, defining the quadratic error between two matrices A and B as Err = |A| − |B| 2 / |A| |B| , is about 3 %. The fidelity, defined as F c = T r(|A.B † | 2 )/ T r (|A| 2 ) T r (|B| 2 ), is about 94 %. The remaining difference between the two reconstructions can be attributed to actual changes in the physical system, such as a longitudinal displacement of the fiber when the pressure is applied. This explanation is supported by the variations in the defocus coefficient in the Fourier plane as=t the output of the fiber.