Characterising optical fibre transmission matrices using metasurface reflector stacks for lensless imaging without distal access

The ability to form images through hair-thin optical fibres promises to open up new applications from biomedical imaging to industrial inspection. Unfortunately, deployment has been limited because small changes in mechanical deformation (e.g. bending) and temperature can completely scramble optical information, distorting images. Since such changes are dynamic, correcting them requires measurement of the fibre transmission matrix (TM) in situ immediately before imaging. TM calibration typically requires access to both the proximal and distal facets of the fibre simultaneously, which is not feasible during most realistic usage scenarios without compromising the thin form factor with bulky distal optics. Here, we introduce a new approach to determine the TM of multi-mode fibre (MMF) or multi-core fibre (MCF) in a reflection-mode configuration without access to the distal facet. A thin stack of structured metasurface reflectors is used at the distal facet to introduce wavelength-dependent, spatially heterogeneous reflectance profiles. We derive a first-order fibre model that compensates these wavelength-dependent changes in the TM and show that, consequently, the reflected data at 3 wavelengths can be used to unambiguously reconstruct the full TM by an iterative optimisation algorithm. We then present a method for sample illumination and imaging following TM reconstruction. Unlike previous approaches, our method does not require the TM to be unitary making it applicable to physically realistic fibre systems. We demonstrate TM reconstruction and imaging first using simulated non-unitary fibres and noisy reflection matrices, then using much larger experimentally-measured TMs of a densely-packed MCF, and finally on an experimentally-measured multi-wavelength set of TMs recorded from a MMF. Our findings pave the way for online transmission matrix calibration in situ in hair-thin optical fibres


I. INTRODUCTION
Lensless imaging through hair-thin optical fibres is a technique that promises to open up new areas in biomedical imaging such as: in vivo bright-field, dark-field, and fluorescence microscopy [1]; quantitative phase and polarimetric imaging with applications in early detection of cancer [2,3]; and endoscopic confocal microscopy for high-resolution imaging [4].
For all these applications, accurate characterisation of the deterministic propagation of light through the fibre, discretised as the transmission matrix (TM) [5], is essential for imaging, whether used directly [3], indirectly via phase conjugation [6] or via optimisation [7].
Unfortunately, the TM is highly sensitive to small changes in mechanical deformation and temperature that are unavoidable in living subjects, so the TM should be calibrated immediately before imaging. The most direct way of measuring a TM is to send pre-defined optical fields into the distal facet of the fibre (furthest from the operator) and measure the resulting fields at the proximal facet (nearest the operator). This widely-used approach has a critical limitation: the distal facet of the fibre is deep within the subject during imaging so the pre-defined optical fields cannot be reliably generated without additional bulky distal optics that would compromise the ultra-thin form factor.

Several methods have been proposed to overcome this limitation and open new avenues
for minimally invasive optical imaging in inaccessible areas of the body, e.g. deep in the brain where the fibre may need to curve for safe access [8]. Placing a holographic plate on the distal facet of the fibre, which is illuminated via a physically separate single-mode fibre, can be used to create a 'virtual beacon' [9]. A phase-conjugated version of the resultant proximal field is used to recreate a single focussed spot at the distal facet. This spot is typically fixed in position but with a multi-core fibre (MCF), the 'memory effect' could be exploited to enable scanning over a small distances [10]. This method is only able to partially recover the TM and so while confocal imaging is possible, wide-field imaging (e.g. quantitative phase) is not achievable. Further, the addition of a single-mode fibre to illuminate the beacon adds bulk to the system.
Highly accurate modelling of TM perturbations has also been shown to be feasible, but requires precision manufactured fibres such as telecommunications OM4 grade multi-mode fibre (MMF), is limited to short distances (<10cm) [11] and needs very precise bending characterisation, for example through the addition of a Bragg-grating shape sensing fibre [12,13]. Another recently demonstrated method uses reflectors at the distal end to introduce time delays to resolve the fibre TM [14]. However, this adds undesirable complexity to the system and also requires a unique reflector for each characterised propagation mode, which does not scale well for high-resolution imaging applications. Gu et al. propose a more generic approach to in situ TM characterisation: to infer the forward TM based on the reflection matrix (RM) i.e. light that has made a roundtrip into the proximal facet, off a distal reflector and back out the proximal facet [15].
Applied naively, this approach suffers two major limitations. The first arises due to the transpose symmetry of fibre TMs, which means that measured RMs exhibit a quadratic relationship with their respective TMs, resulting in ambiguities during recovery [15,16]. In some cases this ambiguity reduces to an N -dimensional vector of sign errors (i.e. a vector ∈ {−1, 1} N ) for an N -pixel image, for example, when using MCF with low core-to-core coupling [17] or using an MMF that is near perfectly unitary (via the Autonne-Takagi matrix factorisation) [15]. In such cases, the ambiguity may be resolved by in situ optimisation assuming prior knowledge of the sample [17] or by treating the ambiguities as invariants of the fibre that can be measured in advance [15]. In practical cases, however, TMs are not unitary even in precision-manufactured MMF [18,19] because high-order modes, with more power concentrated in the fibre cladding, tend to exhibit power leakage at bends [20].
The second limitation arises because light must be able to pass through the distal facet of the fibre to enable imaging. Gu et al. address this by proposing to include a shutter at the distal facet, but again, this added bulk would compromise the ultra-thin form factor [15].
Seeking to overcome the need for precision-manufactured MMFs, while maintaining the hair-thin property of the fibres, we demonstrate here a new method that enables measurement of non-unitary fibre TMs of arbitrary size without access to the distal facet and without adding distal bulk, making it suitable for ultra-thin flexible imaging devices. The key innovation is a distal reflector comprising a multi-layer stack of spatially heterogeneous partially reflecting wire-grid polarisers (termed 'metasurfaces') and long-pass optical filters. The filter stack creates a spatially and spectrally varying reflector, whose properties can be characterised prior to use and will remain fixed throughout any imaging experiment. By combining multiple RMs recorded at different wavelengths it is possible to estimate the instantaneous TM and then perform imaging at another wavelength.
We first introduce a new method that analytically extracts the TM based on 3 measured RMs at different wavelengths and prior knowledge of the reflector properties (the 'zerothorder method'). We then extend the physical model to account for realistic path-length differences and present an iterative numerical method that is able to correct analytically derived TMs to satisfy this extended model (the 'first-order method'). Neither method requires that the TM be unitary, only that it is invertible, meaning they could be applied to lossy fibres or general scattering media. Following TM reconstruction, we then show how imaging can be performed at a fourth wavelength.
Next, we computationally demonstrate the first-order method using simulated 32×32 non-unitary fibre matrices with noise. We then demonstrate the recovery algorithm using measured MCF TMs (1648 × 1648) and realistically simulated reflector stacks, showing the method can be scaled up effectively. Finally, we demonstrate TM recovery using a multiwavelength set of TMs measured from a step-index MMF validating both that the method works for real MMF and that the first-order physical model used to predict changes in TM with wavelength is physically realistic. Our findings underpin a compelling new approach for enabling ultra-thin flexible lensless imaging via optical fibres.

A. Formulation of the physical model
The physical model used for reference in this paper is shown in Figure 1. We consider a well-established model where a field containing M pixels, each complex-valued to encode amplitude and phase, in two orthogonal polarisations is recorded. The field at some input plane (e.g. the proximal facet), is represented by a vector x ∈ C 2M , and the field at an output plane (e.g. the distal facet) is represented by a vector y ∈ C 2M . In the forward propagation direction, these vectors are related by the fibre TM, A ∈ C 2M ×2M : This equation can be derived by discretisation of the linear propagation relationships between the two planes [5,21]. For example, a continuous optical field might be sampled using delta functions at M spatial locations in two orthogonal polarisation states using a Jones vector formalism [2,22]. The complex-valued samples, representing amplitude and phase across two polarisations, can then be ordered to produce the vector x with 2M elements [3]. Similarly, if we consider a field, x ∈ C 2M , at the distal facet propagating in the reverse direction to become a field, y ∈ C 2M , at the proximal facet, these are related by the transpose of the fibre TM [15]: We then consider the impact of adding a reflector at the distal facet of the fibre. In general, the reflector is considered to be spatially heterogeneous in terms of its localised at each spatial point. Further, if the reflector is offset from the fibre, light may couple between spatial positions due to diffraction. This behaviour is linear and so is represented by a partial reflector matrix (PM), R ∈ C 2M ×2M , that relates y and x (see Figure 1): Next, combining Equations 1, 2 and 3 we can determine the optical field exiting the proximal facet in the reverse direction: where C = A RA ∈ C 2M ×2M is termed a 'reflection matrix' (RM). Physically speaking, C represents light taking a complete round-trip (or double-pass): forwards down the fibre, off a given reflector and back up the fibre, as illustrated in Figure 1. When C is measured in practice, the measurements may lead to a non-square approximation to C, denoted as C, henceC first needs to be downsampled into a square matrixC, which is then used as a surrogate for C in what follows (see Appendix A 2).
As our first objective is to estimate the TM of the fibre A we would ideally use pairs of measured vectors, (x, y ) through standard a calibration procedure [2,3,19]. Since we now only have access to the proximal facet of the fibre, we instead need to estimate the RM, C = A RA, based on the measured vectors (x, y). Given that we can characterise R in advance (Appendix A 3 a) and it will remain fixed throughout use, the goal is then to recover A based on measurements of C and prior characterisations of R, however, with only this information it is not in general possible to recover A unless the fibre TM is unitary [15,18]. We therefore propose a novel approach that uses several different reflectors, with PMs R n , n = 1..Q, and show that this enables unambiguous recovery of A in the more general non-unitary case.
For experimental simplicity, it would be desirable to use as few different reflectors, R n , n = 1..Q, and associated RMs, C n , n = 1..Q, as possible for TM characterisation and to switch between these without any physical moving parts, for example, by modulating wavelength. The case with a single reflector reduces to the method presented by Gu et al. [15] and does not produce unique solutions for non-unitary TMs, A. Therefore, we need Q ≥ 2 reflectors.
We will consider in what follows the case of three reflectors, i.e. Q = 3. For TM characterisation (with reference to the appropriate optical path in Figure 1) there will be three associated PMs R n , n = 1..3. The TMs associated with each reflector during characterisation, A n , n = 1..3, may in general be different, for example if wavelength modulation is used to activate different reflectors. We can then record 3 associated RMs, C n , n = 1..3 measured for example using the process of Appendix A 3 b. Based on Equation 4, we express these RMs in terms of the PMs R n , n = 1..3 and associated TMs A n , n = 1..3 as: In order to solve these equations for A n , n = 1..3 we must assume a known relationship between these TMs, for example a wavelength-dependent phase shift, and must also have measured R 1 , R 1 and R 1 a priori using a characterisation process such as that in Appendix A 3 a.
Following TM recovery, the final objective is to performing imaging. This involves using an extended optical path ( Figure 1) in which light propagates down the fibre via a possibly different fourth TM, A 4 , is partially reflected off the reflector via a PM, R 4 , and is partially transmitted through the reflector via a TM, A ref l . R 4 and A r ef l can be measured a priori using a characterisation process such as that in Appendix A 3 a. Following possible further propagation and reflection off the sample, the returned light at the proximal facet is used for image reconstruction.
Again, a known physical relationship is required to recover A 4 from A 1 , A 2 or A 3 and hence reconstruct images. We therefore consider two simple models to approximate this relationship between the calibration and imaging TMs and enable image recovery: a 'zerothorder model' and a 'first-order model'.

B. Zeroth-order model
The zeroth-order model assumes that the TMs under the different reflectors are the same, i.e.: This could correspond to physically switching reflectors without disturbing the fibre TM.  [23] so this linear approximation is limited to be within the 'spectral bandwidth' of the fibre. We can then use a coupled-mode theory treatment of optical fibres [24,25]. This perturbational approach models a length of MMF as a sequence of infinitesimal segments that introduce field coupling. The differential change in field in each mode can be modelled by a vector, dm ∈ C 2M ×1 , such that: where dA ∈ C 2M ×2M represents an infinitesimal coupling matrix, and m ∈ C 2M ×1 represents the field in each of M spatial modes over 2 orthogonal polarisations. This matrix equation represents a system of 2M linear differential equations. The solution to these equations for any arbitrary input set of modes, x, after travelling a distance along the central axis of the fibre is given by a matrix exponential [24,26]: Equivalently, this model can be thought of as the Lie algebra dA constructing the Lie group of invertible matrices e dA ∈ C 2M ×2M [27]. We can estimate dA from the TM at characterisation wavelength λ 1 , A 1 as: where log A 1 represents a matrix logarithm and 1 is the optical path length along the central fibre axis. In general, matrix logarithms of complex matrices produce degenerate solutions, analogous to logarithms of complex numbers. This means that in estimating dA each eigenvalue, γ, has an equivalence class of degenerate forms: γ + i2πn/ 1 , n ∈ Z.
However, under the bandwidth-limited scenario presented here, the change in wavelength results in an overall relative phase shift between any two matrix elements of < 2π (Appendix E) and so therefore n = 0 is the appropriate choice.
If we assume constant dispersion within the relatively small (< 10nm) spectral bandwidth considered, a reasonable assumption for typical glasses [28], we can find the equivalent optical thicknesses at the other characterisation wavelengths (λ 2 , λ 3 ) and imaging wavelength (λ 4 ) relative to the first characterisation wavelength (λ 1 ): Rearranging and expressing in terms of A 1 we can write A key requirement for this model is that the wavelength sweep range is kept within the spectral bandwidth of the fibre, i.e. the bandwidth within which far-field speckle patterns from a fibre remain correlated [29] or equivalently over which the eigenmodes of the fibre remain approximately the same. We show in Appendix E that this implies the maximum change in path length for any mode is < 2π.
The eigendecomposition of dA is: where Q ∈ C 2M ×2M is a matrix whose columns, q m with m = 1..2M , are the eigenvectors of dA and Λ m with m = 1..2M are the corresponding eigenvalues. The eigendecomposition of A n with n = 2, 3, 4 is therefore [30]: In general, matrices constructed using matrix exponentials (as in Equations 15 -17) share the same eigenvectors but the corresponding eigenvalues are related by an exponential [30].
Our model therefore assumes that the eigenvectors of the fibre do not change appreciably over the selected bandwidth, which is consistent with previous experimental work [31]. This is in contrast to the zeroth-order model that requires the additional assumption that eigenvalues stay constant within the spectral bandwidth.
It is not straightforward to solve Equation 5 to Equation 7 analytically to obtain A 1 , A 2 , A 3 or A 4 when they are related by matrix exponentials as in Equations 15 -17.
Therefore, we present an optimisation-based approach. We begin by using the zeroth-order approximation to obtain an approximate solution for A 1 , termedÂ 1 . This is achieved by performing an optimisation based on Equation 5: where .. F indicates the Frobenius norm. The solution to this optimization problem,Â 1 , may be found using an iterative gradient descent solver initialised by the zeroth-order approximation. We know that because the true A 1 may be non-unitary, this equation may not have a unique solution [15]. We therefore use the result of Equation 20,Â 1 , to generate an estimate of A 2 , asÂ 2 = e λ 1 λ 2 log(Â 1 ) . This is then used as the starting point for a second optimisation:Â Finally,Â 3 = e λ 2 λ 3 log(Â 2 ) is used as the starting point for a third optimisation: To complete the iteration,Â 1 = e The algorithm presented here is observed to converge reliably to the correct solution and has the computational advantage that the relatively expensive matrix logarithm operation is performed only between minimisations, as opposed to in each objective function. It is therefore used for the remainder of this paper.

D. Imaging
Considering the case in which reflectors are switched by modulating wavelength and the first-order model is used to reconstruct the fibre TM at the imaging wavelength (λ 4 ),  There are two approaches to illuminating the sample via the imaging fibre. The first is to use the measured TM at the imaging wavelength, A 4 , to produce a known distal illumination profile. This produces controllable illumination profiles that could enable, for example, various structured illumination techniques but assumes that the TM reconstruction process is complete before imaging. In cases where computing the TM introduces an undesirable time delay, a known proximal illumination profile could used to produce a semi-random distal illumination profile that can later be corrected by averaging over several random illuminations, a technique known as speckle averaging [32], or by using knowledge of A 4 reconstructed offline. Considering our physical model (Figure 1), we project a random or pre-calculated illumination vector, x illum , onto the proximal facet of the fibre, giving at the distal facet: Next, the light passes through the full reflector structure, which in the imaging configuration must be partially transmissive. In general ×2M . This allows for oversampled illumination fields. The field exiting the reflector stack is then: with y illum ∈ C 2N . This can be propagated by some linear operator, G ∈ C 2N ×2N , through free-space (e.g. Fresnel or Fraunhofer propagation) before reaching the target where the field will be: In general G is parameterised by the distance, d, between the subject and the distal surface of the reflector structure ( Figure 1. In general, d is not known a priori and so must be estimated during operation, for example using an automatic focus detection algorithm such as the Brenner algorithm [3,33]. For simplicity we will assume the target is a reflective surface, though a volume scattering target could be characterised using several known illumination profiles in sequence as in spatial frequency domain imaging [34]. We therefore let the target be represented by vector, r target ∈ C 2N . An element-wise (Hadamard) product with the illumination gives: This in turn, produces a field at the distal fibre facet of: Next, we consider the illumination light that is reflected back from the reflector stack: We sum the two fields and propagate back through the fibre giving: The goal is to recover r target from y total , the raw measured data. To do this, we first recover x target by rearrangement of Equation 29 and substitution of Equations 27 and 28: where (..) + represents a generalised inverse because A ref l may be non-square. Quantities on the RHS of Equation 30 are known prior to imaging, except for y total which is recorded during imaging and possibly G, which may need to be estimated as described above. Therefore, we could naively recover an estimate of r target , which we termr target , from x target through an element-wise (Hadamard) division: However, this approach may erroneously amplify noise in areas where illumination power, |y illum | 2 , is low producing speckle-like artefacts in the recoveredr target . To compensate for these artefacts we apply a probabilistic approach. Using the model derived in Section III F, the noise at pixel (x, y) is considered circularly-symmetric complex Gaussian distributed with standard deviation σ. The noise power at that pixel, termed n(x, y), is therefore drawn from a χ 2 distribution with 1 degree of freedom: Using an empirically measured value for σ we use the χ 2 cumulative distribution function, χ 2 CDF , to estimate for each pixel a probability that it is generated by noise, p n (x, y): We can order p n (x, y) across all (x, y) into a vector denoted as p n . Separating the illumination into amplitude and phase parts as: we then use element-wise division by a probabilistically weighted term as follows: With this approach, pixels with smaller illumination amplitudes tend only to modify the phase of x illum when correcting the speckle-like artefacts during recovery ofr target . Pixels with larger illumination amplitudes, more likely to contain useful information, experience both amplitude and phase correction. In this way, the speckle-like artefacts inr target are reduced. Areas of low illumination will still have poor signal-to-noise ratio but this could be corrected by using a more uniform pre-determined illumination profile or, again, by implementing speckle averaging [32].

A. Implementation of reflector
The switchable reflector structure described in Section II A could be implemented in a number of ways. Reflectors could be switched using a physical mechanism, e.g by translating or rotating a large distal plate relative to the facet. Though this would enable use of the computationally simpler zeroth-order reconstruction approach, it would require actuators and a large plate at the distal facet, adding undesired bulk and compromising the ultra-thin form factor. Here, we propose instead that switching be achieved through modulating wavelength using the concept of a reflector stack ( Figure 4a) with different reflectance behaviour at several closely spaced wavelengths.
The envisaged stack comprises layers of long-pass optical filters so that light penetrates to different depths at different wavelengths ( Figure 4b). Though the first-order reconstruction approach must be used on account of the wavelength modulation, it is still advantageous to satisfy the requirements for the zeroth-order model so that it can be used as a starting point for optimisation and because it is a special case of the first-order model. Therefore, we require that the stack should couple light between different modes of the fibre, including polarisation modes, in a pseudo-random manner to ensure with high probability that the eigenvalues of the TM are distinct as required (Section II B). This could be realised by placing atop each filter a thin (20-50nm) layer comprising spatially heterogenous wire-grid polarisers ( Figure 4c) fabricated, for example, using elctron beam lithography [35]. To act as optical polarisers the fabricated wires must have width and pitch less than the wavelength of light (< λ/2) and so these layers are classed 'metasurfaces'. Spatial heterogeneity would be achieved using arrays of partial polariser cells that have varying linear diattenuation (or dichroism) and diattenuation axis orientations ( Figure 4c).
Because of the optical filters, the reflector stack provides a different PM at each wavelength, R n , n = 1..4. This is seen by considering the different sum of wavefronts that occurs due to reflection and absorption from different layers in Figure 4a. Sweeping a tunable laser effectively switches between reflectors without any actuation at the distal facet. The solid nature of these structures means they can be characterised before use, for example using the process in Appendix A 3 a, and are expected to remain stable throughout operation. The linear nature of this reflector stack design ensures that each PM, R n , n = 1..4, is symmetric due to optical reciprocity [15].

B. Selection of wavelengths
For a given reflector stack the optimal filter centre wavelengths, RM characterisation wavelengths and imaging wavelength must be determined. We assume here that all filters are long-pass but a system using short-pass filters would also be feasible. First, we approximate the long-pass filter transmission function with a sigmoid: where τ max is the maximum power transmission (fixed here as 0.8 based on real component data), α determines the steepness of the filter (measured to be ∼0.73 based on numerous commercially available filters, e.g. ThorLabs FELH0850, which represents a ∼5nm wavelength change for 10% to 90% normalised transmission), and λ f ilt is the centre wavelength of the filter. For the case of 3 RM characterisation wavelengths (λ 1 < λ 2 < λ 3 ), this leads to the following matrix: where λ a < λ b < λ c are the centre wavelengths of the 3 filters with transmission curves described by Equation 32. To maximise the difference between the filters' behaviour at the interrogation wavelengths, we minimise the condition number of F, which gives our objective function, g: where κ(..) represents condition number and .. 2 is the 2 norm. We also want to ensure that at the imaging wavelength λ 4 (> λ 3 ) the reflector stack transmits significantly more light than at the longest characterisation wavelength (λ 3 ). This is essential for efficient illumination and imaging. To achieve this, we define a ratio, β: The full optimisation problem is then: where λ min and λ max are the minimum and maximum range of the tunable laser, ∆λ las is the minimum tuning step of the laser, and ∆λ f ib is the spectral bandwidth of the fibre.
Here, we set λ min and λ max to near-infrared wavelengths of 840nm and 860nm respectively [3]. We set ∆λ las to 0.2nm to reflect the tuning step of commercial tunable lasers. ∆λ f ib is set to be 7nm representing a reasonable spectral bandwidth for a fibre of length ∼2m [31,36].
Finally, we choose β min = 10 as providing acceptable isolation between characterisation and imaging. With the objective and constraints defined, optimisation is performed using a genetic algorithm, implemented in MATLAB 2016b.
For a first proof-of-principle, fibre TMs of size 32×32 are simulated. This corresponds to 16 pixels of spatial resolution in two orthogonal polarisations. The matrices are based on a model of a MCF with large core-to-core coupling. The 16 spatial pixels are considered to represent light guiding cores organised in a rectangular 4×4 grid with unit spacing (arbitrary units) between adjacent elements. The power coupling between a given core at the input and a given core at the output is modelled as decreasing exponentially with the squared lateral distance between them, i.e.: where p α,β is the power coupled between core α at the input (with coordinates (x α , y α )) and core β at the output (with coordinates (x β , y β )). σ 2 f ib is a coupling parameter which we set to be 3. This is a Gaussian power coupling model, which reflects empirical observations of real MCFs [37]. The power coupling profile is duplicated for the second polarisation via a Kronecker product with a 2 × 2 matrix of ones. The complex phase of each element is drawn randomly from a uniform distribution, θ α,β,p,s ∼ U (−π, π) where p = 1, 2, s = 1, 2 indicate the input and output polarisations respectively. This gives a combined expression for each matrix element: The simulated TM, A, is formed of all elements a α,β,p,s in some ordering such that each pair (β, s) defines a unique row index, and each pair (α, p) defines a unique column index.
A is then decomposed using singular value decomposition: where If using the zeroth-order model the simulated matrix is used as the TM at all wavelengths.
If using the first-order model the simulated matrix is used as the TM at wavelength λ 1 , A 1 , and the TMs at the other wavelengths are generated from the matrix logarithm of A 1 (dA from Equation 11) and then applying Equations 15-17 to obtain Although this method of generating matrices simulates a MCF, it could just as easily represent a MMF through an appropriate change of mode basis, e.g. by redefining the spatial pixels to be Laguerre-Gauss modes [2,19]. Simulated TMs are generated using custom code written in MATLAB 2016b.

D. Measured fibre matrices
The proposed methodology is further validated by testing the first-order model and associated recovery using TMs measured from two real fibres. The first is a MCF with TM of size 1648 × 1648 measured at a single wavelength of 850nm in [3]. This TM is used as Next, we present a method of simulating physically realistic reflectors like that shown in Figure 4a. This is achieved by extending the propagation matrix method used for simulating multilayer optical materials (Bragg reflectors or stacks) [39]. The reflector is modelled as a series of layers, as shown in Figure 4a, and the propagation operator used is 2D Fresnel propagation, applied separately to each polarisation. This is in contrast to conventional propagation matrix method in which the propagation operator is the complex exponential propagation operator e −ikz where z is distance and k is wavenumber. The Fresnel propagation operator enables computation of output fields p H and p V (where subscripts H and V denote horizontal and vertical polarisations respectively), resulting from input fields p H and p V propagating a distance l through a layer as: where F is the discrete Fourier transform, n is the refractive index of the layer, λ is the wavelength, k = 2πn λ , ξ x and ξ y are coordinates in the Fourier plane, and constant factors are neglected [40]. This is repeated for p V , giving a 2D complex vector at each point.
Using this modified method, the reflector stack is simulated using three types of layer: pure glass, glass with a wire-grid polariser metasurface on the top, and optical filters. Here, we simulate absorptive optical filters but reflective filters would work equally well because their behaviour in the passband is the same. The filters have transmission curves typical of commercially available components and the centre wavelengths are determined using the process in Section III B.
The stack simulated here comprises a first 1mm thick layer of glass, followed by the 3 filters in succession, each 3mm thick. The metasurfaces, negligibly thin for propagation purposes, are placed between the glass and the first filter, between the first and second filters, and between the second and third filters. The glass and filters have matched refractive indices of 1.52 and for simplicity we neglect the imaginary part of refractive index, i.e. distance-dependent loss. Therefore, the only internal reflections occur at the metasurfaces, reflective filters (if used), and the final glass-air interface. Each metasurface is simulated by randomly generating diattenuating partial-polariser Jones matrices for each sampled point of the field. Each point is assigned a diattenuation angle drawn from a uniform distribution, θ D ∼ U (−π, π), and a diattenuation drawn from a uniform distribution, D ∼ U (0, 1). The transmitted light at each pixel is computed by multiplying the sampled field vector by the relevant Jones matrix: The light reflected is then given by: where α is a parameter representing overall power loss in the metasurface due to plasmonic losses, set to a typical value of 0.8 here [35].
The next step is to determine the overall PMs at the imaging and characterisation wave- This process is repeated at each of the desired wavelengths. The coupling between spatial modes introduced by diffractive propagation between layers and the coupling between polarisations introduced by the metasurfaces produces PMs that with high probability have distinct eigenvalues, which is necessary for approximate zeroth-order TM recovery that precedes first-order TM recovery (Section II B).
The input fields chosen may be derived from the columns of the fibre TM (the modes of the fibre), because any field exiting the fibre must be a linear combination thereof. However, the modes of the fibre may be expressed in some particular basis, e.g. Laguerre-Gauss modes for MMF, and so must first be converted to Cartesian basis so that Fresnel propagation

F. Noise model
We now consider the effect of noise on the presented TM characterisation and imaging system. Noise is inherently present in any realistic implementation of this system, such as that proposed in Appendix A 1, and so must be examined to demonstrate feasibility and robustness.
The main source of noise in such measurements (for example data acquired using the system in Figure 10) is assumed to be electronic noise introduced by the image sensor.
However, there are many processing steps (mostly matrix operations) that transform these raw images into the RMs used for reconstruction [3]. The noise in each image pixel is assumed to be independent, which is consistent with previous fibre imaging work [3,37].
By the central limit theorem, the repeated multiplication and addition of noisy variables via matrix operations will produce circularly symmetric complex zero-mean Gaussian noise.
The relatively strong optical confinement (i.e. wave-guiding) during RM characterisation will result in large detected optical power and therefore approximately Gaussian shot noise.
Laser power fluctuation (relative intensity noise) is not inherently Gaussian, but after time averaging and repeated matrix operations creates a Gaussian contribution to overall noise [41]. Therefore, we model the noise in the system by adding random matrices with independent circularly symmetric complex Gaussian elements to each PM (R 1 , R 2 , R 3 ) and each RM (C 1 , C 2 , C 3 ). For example, a noisy version of R 3 is given by: where each element of N 3 is a complex random variable, n = a + ib, drawn from the distributions: The same distributions, parameterised by σ n , are used to generate noise matricesR 2 , R 3 ,Ĉ 1 ,Ĉ 2 andĈ 3 . We vary σ n over a wide range, including values reflecting empirically observed experimental noise (see Figure 6) [3].
To model noise in image reconstruction we add circularly symmetric complex Gaussian noise to A ref l , R 4 and the measured field at the proximal fibre facet, y total , to get noisy versionÂ ref l ,R 4 andŷ total respectively. Combined with the noisy estimate of the TM at the imaging wavelength,Â 4 , we can then get a noisy estimate of the illuminated target, r target . A noisy estimate of the illumination can then be generated usingÂ 4 andÂ ref l using Equation 25, and we can recover a noisy estimate ofr target using Equation 31. Simulation of noise is implemented using custom code in MATLAB 2016b.

G. Performance metrics
To validate the performance of the presented method, we define several error metrics.
The first is the error between the recovered TM at the imaging wavelength,Ã 4 , and the ideal TM, A 4 , given as: where ||..|| F represents the Frobenius norm. Dividing by the norm of the ideal matrix, A 4 ensures the metric presents normalised (or proportional) error. We then define a second error metric that represents the error introduced when using a recovered matrix to reconstruct an image: where r target is the ideal image. A further metric measures error due to noise in image recovery: wherer target is the estimated recovered image as defined in Equation 30.

H. Recovering TMs
Having produced the necessary instantaneous RMs (C n , n = 1..3), based on real or simulated TMs, the next step is to solve for the TMs using a priori known reflector PMs, In the first-order case, the recovery algorithm depicted in Figure 2 is implemented in the TensorFlow package via Python because of its efficient GPU utilisation and hence superior speed. Initially, we use the Adam stochastic gradient descent optimiser [42], changing to a conventional gradient descent optimiser for fine adjustment in the last few iterations.
The zeroth-order recovery algorithm typically finishes in less than 1s. However, first-order iterative algorithm is significantly slower and can take several hours for 1500 iterations on a 1648 × 1648 matrix on a single Tesla K40 GPU. For smaller 32 × 32 matrices requiring about 500 iterations, the running time is typically < 5 seconds.

A. Simulated TMs
We first randomly generate a single 32 × 32 non-unitary matrix, A 1 , and then A 2 , A 3 , A 4 using the first-order model. We then randomly generate reflector PMs R n with n = 1...4 and use A 1 to simulate RMs C n with n = 1...3 and perform TM recovery using the first-order method (Figure 5a) with element-wise error < 0.05 (Figure 5b). The singular values of this matrix vary from 1 down to 0.03 (Figure 5c), verifying that this recovery can be performed on very non-unitary matrices.
Next, we examine the impact of noise by repeating this recovery process for 64 different noise realisations with increasing noise power (see Section III F). Matrices are generated using either the zeroth-or first-order model and then reconstructed using the respective recovery algorithm. The resultant error in image reconstruction using the recovered TM ( Figure 6a) for the zeroth-order model, in which the recovery is largely analytical, deteriorates rapidly in performance beyond a noise threshold (visualised as a 'hump' in Figure 6a).
In the first-order model, in which recovery relies on an optimisation process, such instability is not observed, thus it is suspected to arise from numerical error accumulation when computing analytical results. The first-order model exhibits improved stability at higher noise levels (such as those encountered experimentally [3]) because the optimisation process, by definition, does not terminate until error (including numerical error) is minimised. At lower noise levels, first-order performance is limited by the number of iterations of the algorithm, which is evident from the impact of increasing the number of iterations from 300 to 500.
Furthermore, testing the effect of 200 different random TM realisations that could result from perturbations of the fibre or manufacturing variations (Figure 6b) shows that after 500 iterations using the first-order recovery algorithm, the error converges towards zero with >50% of matrices having error < 0.1 and 90% having error < 0.4. Negligibly small noise power is used so as to only observe the effect of perturbations.

B. Real MCF TMs
Having established the potential of the first-order model using simulated matrices, we then applied it to an experimentally measured MCF TM [3]. Using the optimisation procedure The non-square TM is first 'downsampled' to a square form of size 1648×1648 (Figure 7a). This downsampling process can be achieved via many methods, for example that detailed in Appendix A 2, and can be reversed at the end of recovery to obtain an estimate for the original non-square TM.
The measured TM is used as A 1 and we apply the first-order model to simulate A 2 , A 3 and A 4 . Next, realistic reflector PMs, R n with n = 1...4, are simulated (see Section III E) and RMs C 1 , C 2 and C 3 computed using Equations 5-7. Using the first-order algorithm we then recover the TM at the imaging wavelength, A 4 (Figure 7a). After 1500 iterations, the maximum element-wise error of the recovered matrix is < 10 −3 (Figure 7b). Because of the dense core spacing (< 4µm) there is significant coupling between cores at the wavelength used, 852nm [3], meaning the MCF begins to exhibit mode-coupling properties more like MMF. The singular values (Figure 7d) show that the matrix is non-unitary but has a low condition number (∼4), again similar to experimental measurements of MMF [19].
Having recovered the TM, we next test the imaging procedure using random illumination.
A 28 × 28 image is simulated and then successfully recovered (Figure 8). The image, r target with reference to Figure  in higher noise (see Section II D) and could thus be improved by averaging over multiple random illumination conditions i.e. speckle averaging [32]. Under conditions of perfectly uniform illumination, we calculate that n,image becomes < 0.002, giving a lower bound on error dominated by other facts e.g. computational limitations.

C. Real multi-wavelength MMF TMs
Using a multi-wavelength fibre TM measured from a 2m piece of step-index MMF [38] to generate RMs, C n , n = 1..3, also led to successful TM,

V. DISCUSSION
The ability to form images through hair-thin optical fibres is currently limited due to the need for in situ TM calibration. To overcome this limitation, we propose a method for recovering the instantaneous TM of a fibre based only on instantaneous reflection-mode measurements and a priori characterisation reflection-and transmission-mode measurements.
Calibration is achieved using three reflectors placed at the distal tip of the fibre, each of which produces a different reflection matrix based on a small modulation in wavelength, from which the imaging TM can be determined. As the fibre TM will differ slightly at each of these wavelengths, we developed a first-order method of modelling and compensating this change using matrix exponentials, which is valid within the spectral bandwidth of the fibre. Under these assumptions, we demonstrate that it is possible to recover TMs and perform imaging using realistic optical components with acceptable noise tolerance for both simulated and measured TMs, including an experimentally measured MCF TM and a multi-wavelength MMF TM.
Nonetheless, several challenges must be overcome for experimental deployment of the proposed method. Given the demonstrated validity of the current model within the spectral bandwidth of the fibre, at 850nm the wavelength modulation would need to be within ∼ 5nm for a typical 1m length of MCF [36] or at 1550nm within ∼ 10nm for a typical 2m length of MMF [31]. In biomedical endoscopy, fibre lengths are typically 1-2m, however, for applications in industrial inspection, greater lengths may be needed. To extend the model to work for smaller bandwidth fibres, e.g. longer fibres or MMFs with very large numbers of modes, either the characterisation bandwidth could be reduced or a more complex ('high order') fibre model used. The characterisation bandwidth is typically limited by the 'sharpness' of available optical filters, which enables significant modulations in reflectance/absorbance over very small wavelength ranges. Custom-fabricated filters may offer improved performance over off-the-shelf products. Alternatively, more complex models of the fibre could be used. Complete propagation models of graded-index MMF have been used to accurately compensate bending [11] but require precise a priori knowledge of the refractive index profile. A major advantage of the approach presented here is that no such prior knowledge of the fibre is required, enabling the use of more complex refractive index profiles such as MCF. It may be possible to develop a more complex model that uses the differential changes in the TM with respect to wavelength to model the fibre over a larger bandwidth for example using machine learning techniques [43,44]. As they do not require a reference model, machine learning techniques may also enable the use of a smaller reflector stack, with only 2 reflectors, though our preliminary investigations have not conclusively shown that this produces a unique solution when using an optimisation process to recover the TM.
Another challenge compared to existing transmission-mode fibre characterisation systems is the introduction of additional characterisation steps, both before and during imaging for instantaneous TM recovery (see Appendix A 3). For example, to recover a single TM will require 3 times as many measurements because of the 3 distinct wavelengths. Speed can be improved by experimental design, for example using high-speed digital micromirror devices instead of liquid-crystal SLMs [45], or by exploiting a priori known sparsity structures of fibres to parallelise spatial measurements [3]. Further, because the characterisation measurements use laser light at distinct wavelengths, they could be captured simultaneously by placing appropriate filters in front of 3 separate image sensors, one for each wavelength.
Computational recovery of TMs and images also presents a speed limitation that could be addressed by using additional GPUs or specialised hardware like an FPGA or ASIC, finetuning the algorithm and solver settings, or only reconstructing a single polarisation (a 4-fold speed increase). If an analytical solution for the first-order case could be found, a dramatic speed increase would result.
A final challenge is fabricating and installing an appropriate reflector stack at the distal tip of the fibre. While optical filters can be purchased off-the shelf, wire-grid polariser metasurfaces most likely require custom fabrication using nanofabrication techniques such as electron-beam lithography. Single-layer devices of this nature have already been demonstrated [35,46], but extending these to make reflector stacks will involve a number of additional deposition, lithography and processing steps. Though the metasurfaces must have specific polarisation transmission and reflection properties, they need not be fabricated to a specific design with high precision because they can be characterised in situ.
Because the method does not place any requirements on the singular value distribution of the TM other than that the TM is invertible (i.e. unitarity is not required), it may also be applicable to more general scattering media. We provide further validation of this by demonstrating recovery of highly non-unitary simulated TMs with a very uneven distribution of singular values, similar to the quarter-circle distribution found in random scattering media [47]. Nonetheless, it would require a precisely characterised reflector to be placed inside the medium, which would require an invasive step in tissue imaging applications.

VI. CONCLUSION
In summary, we have developed a new approach to determine the transmission matrix of a multi-mode optical fibre without requiring access to the distal facet. We have demonstrated successful transmission matrix recovery and imaging using realistic optical components with acceptable noise tolerance for simulated transmission matrices, and those experimentally measured from a multi-core fibre and a multi-wavelength multi-mode fibre. The proposed method paves the way for experimental realisation of lensless imaging through hair-thin optical fibres. Appendix A: Conceptual experimental implementation

Experimental set-up
A conceptual experimental set-up for reflection-mode TM recovery is shown in Figure 10.
There are two key sections: the illumination arm and the detection arm. These perform similar functions to previous work [3] but are both located at the proximal fibre facet.
There is also an additional input that allows characterisation in transmission mode prior to use. Though our previous fibre TM characterisation set-ups have used non-interferometric phase retrieval [3], an optional reference arm for interferometric imaging is included as it is widely used method in fibre characterisation [1]. The tunable laser is required to provide a high-coherence source at the different characterisation and imaging wavelengths. These can typically be tuned in steps of 0.1-0.2nm, sufficient to cover a range of fibre TMs with different spectral bandwidths. Proposed designs of a reflector stack are shown in Figure 4. The plasmonic metasurface reflectors can be fabricated on a glass substrate using electron beam lithography patterning followed by metal deposition to create high-resolution wire-grid polarisers [35]. Fabrication could be scaled-up for commercial applications using, for example, UV lithography or in-terference lithography [48]. The absorptive or reflective long-pass filters used for the stack can be purchased off-the-shelf or made to order with sufficiently small gradations in centre wavelength. Stability of the stack over time is important to ensure the reflector matrices remain relatively constant. This can be achieved by using stable metals such as gold and depositing protective layers, e.g. MgF, on top of metasurfaces.

Downsampling Reflection Matrices
In practical implementations we may measure RMs that are non-square, denotedC n with n = 1...4, due to different sampling schemes for x and y of Figure 1. For example, the number of pixels on the camera may greatly exceed the number of different input calibration fields used to characterise the RM -in our previous work characterising TMs there were ∼ 10 3 times as many pixels as calibration inputs [3].
In this scenario, consider 2M calibration samples (x s ,ỹ s ) 2M s=1 wherex s ∈ C P ×1 andỹ s ∈ C Q×1 . DenotingC n ∈ C Q×P as a mapping of the fibreỹ =C nx (with reference to Figure   1), we wish to determine a downsampling process to obtain a square RM, C ∈ C 2M ×2M that may be used in our recovery algorithms. One way of doing this is to estimate the largest singular values and corresponding left singular vectors ofC n ∈ C Q×P , which should match those of C n ∈ C 2M ×2M . This is done in the following way. Let so thatỸ cal =CX cal . If the coefficients of an input signalx ∈ C P ×1 with respect to the 2M -dimensional calibration basis are denoted byx cal ∈ C 2M ×1 , i.e.,x :=X calxcal , then its output signal is given asỹ =Cx =CX calxcal =Ỹ calxcal . In other words,Ỹ cal is a mapping of the fibre from the 2M -dimensional calibration basis to C Q×1 , and it is hoped that the largest 2M singular values and corresponding left singular vectors ofỸ cal approximate those ofC n and C n . We therefore first truncate the singular values and vectors ofC n to producẽ C n ∈ C Q×2M . Next, we further compress the matrixC n into a square matrix C n ∈ C 2M ×2M which preserves the same singular values and left singular vectors. This can be done via a process such as that described in [37]. Following this process, C n is square and so can be used for TM recovery as described in Sections II B and II C.
This enables a factorisation ofC n into the product of C n ∈ C 2M ×2M and 'downsampling matrix', T n ∈ C P ×2M such thatC n = T n C n . Once the TM has been determined, T n can be used to reconstructC n if desired. The process of downsampling may then be repeated for the other wavelengths, n = 2..4. Figure 11 shows a flow chart for the full operating process, using the set-up of Figure 10, for recovering TMs and performing imaging.

Operating process
a. Characterisation before first use Before using the system for the first time, we must take accurate measurements of the reflector stack RMs at all wavelengths and its TM at the imaging wavelength ( Figure 12).
These are considered to be stable over long periods so this is considered a 'one-off' process. A flowchart detailing this process is given in Figure 12. Non-square measured TMsÃ n and RMsC n with n = 1..4 may, if needed, be downsampled to form square approximations as described in Appendix A 2.

b. Recovering instantaneous transmission matrix
The sequence of experimental measurements required to record instantaneous RMs at each respective wavelength (C 1 , C 2 and C 3 of Equation 5 -Equation 7) is illustrated in Figure 13. Fundamentally, the process involves sending known calibration patterns into the proximal facet using SLM1 of Figure 10, and recording the resultant field after a roundtrip (down the fibre, off the distal reflector, back through the fibre) on the camera in the detection arm of Figure 10. Although the RMs at different wavelengths are here measured sequentially, to speed up operation it may be possible to measure them in parallel, owing to linear nature of all optical elements involved. This could, for example, be achieved by using multiple laser sources multiplexed together and using several image sensors, each with its own optical filter, such that light from any given laser source (having a specific wavelength) is detected Starting from Equation B1 and Equation B2 we derive: which we compactly write as: can then be transformed into: In order to avoid the trivial solution A − A = 0 to Equation B5, R α and C α must not be equal to the identity matrix, which is ensured by choosing different reflectors R 1 and R 2 .
It is important to note from Equation B4 that C α and R α are similar matrices and so have the same eigenvalues. In a similar fashion we can derive from Equations B2 and B3: Finally, from Equations B1 and B3: We begin by using the Schur matrix decomposition [50]. This states that any square matrix, H can be decomposed as: where Q is unitary and U is an upper triangular matrix. Importantly, the diagonal of U comprises the eigenvalues of H sorted in descending order of magnitude. With reference to Equation B5 we can then write:

Substituting Equation B9 and Equation B10 back into Equation B5
gives: and full derivation). This is arranged by suitable design of the reflectors R 1 and R 2 , which is not in general difficult because even randomly generated matrices produce distinct eigenvalues with high probability [51,52].
Equation B11 is another Sylvester equation but, crucially, U Rα and U Cα are upper triangular matrices so we can solve for A element by element as it is typically done in the Bartels-Stewart algorithm. It follows that elements of A are related to elements of U Cα and U Rα as: ur P,Q a Q,Q − Q−1 s=P uc s,Q a P,s + (ur P,P − uc Q,Q )a P,Q + Q−1 q=P +1 ur P,q a q,Q = 0 where uc P,Q represents an element of U Cα at row P and column Q, ur P,Q represents an element of U Rα at row P column Q, and a P,Q represents an element of A at row P and column Q (see Appendix B 1 for full derivation).
Varying  Having determined a suitable basis we express the matrix we wish to recover, A, as: where w m represents the complex-valued weight of the m th basis element, the matrixÃ m .
The solution space now has only 2M degrees of freedom (represented by w m ) so computational complexity can be further reduced by considering a subset of elements of A. We can select B (≥ 2M ) matrix elements of A, either arbitrarily or according to some prior information e.g. the elements with the largest mean based on a statistical model. The corresponding B elements ofÃ m are then ordered into a column vector b m ∈ C B for every m = 1..2M . These column vectors form a matrix B α ∈ C B×2M : so that we can write: where b est is an estimate of the B selected elements of the true TM, A, and w = [w 1 · · · w N ] is a vector containing the complex weights of Equation B14. Since B α is either a square or tall matrix, we pre-multiply by its Moore-Penrose pseudo inverse, B † α : We then multiply both sides by B α to get: and substitute in Equation B16 to obtain a recursive expression: Following the same steps to derive Equation B17 from Equation B5 but starting from Similarly, starting from Equation B7 gives: Clearly, b est is an eigenvector of B α B † α , B β B † β and B γ B † γ with an eigenvalue of 1. Physical considerations of power conservation suggest the existence of at least one non-trivial solution for b est . Our empirical investigations with both simulated and real TMs (see Section IV) suggest that when using at least two out of Equations B17, B18 and B19 we can identify a single non-trivial solution, the 'true' solution: we arbitrarily choose Equations B17 and B18.
Next we apply a variant of the 'power method' for finding dominant eigenvalues of a matrix [53] and recursively substitute Equation B17 into Equation B18, substitute the result into Equation B17 etc. to give: where t denotes the number of matrix multiplications. For large t, the output b t est is expected to converge to the desired solution b est for any input b 0 est . Alternatively, for large t, the dominant eigenvector of B α B † α B β B † β . . . B α B † α will approximate the solution well. In this work, we find empirically that 2-3 iterations, i.e. t ∈ {4, 6}, is sufficient for faithful TM recovery.
To obtain the desired full TM, A, we determine the weights, w, from approximation b t est using Equation B16 and then compute the appropriate weighted sum of A m with m = 1..2M using Equation B14.
An alternative to such a method could be to directly optimise the weights of Equation B14 to best satisfy Equations B1-B3, or to minimise a prior probability distribution placed on the matrix (e.g. an empirically determined sparsity structure such as that presented in [37]).

Sylvester equations element by element using Bartels-Stewart algorithm
The Bartels-Stewart algorithm is a method of solving matrix equations of the form DX − XE = F (where D, E, F ∈ C N ×N ), termed Sylvester equations, to find the matrix X ∈ C N ×N [49]. Here, we wish to solve the special case where F = 0. We begin with the Schur decompositions of the matrices R α and C α (adapted from Equations B9 and B10): where Q R and Q R is unitary and U R and U C are upper triangular matrices. Importantly, the diagonal of U C comprises the eigenvalues of C sorted in descending order of magnitude, and likewise for U R . The Schur decomposition of a matrix with entries sorted in this way can be computed, for example, in MATLAB. This enables us to write the following Sylvester equation: where A is linked to the original transmission matrix, A by: Recalling that all matrices here are ∈ C N ×N , we first consider the element in row N , column 1, i.e. (N, 1), of the zero matrix on the RHS of Equation B23. We see that: where ur m,n is the element in row m, column n of U R , uc m,n is the element in row m, column n of U C , and a m,n is the element in row m, column n of A .
Next, we require that the eigenvalues of R are distinct, which is ensured by suitable design of reflectors (see Sections II A and III A). We also know that the eigenvalues of R are equal to those of C α because the two are similar matrices (see, for example, Equation B4).
Therefore, the values on the diagonal of U R are distinct from one another, and are equal to the values on the diagonal of U C . The only way that Equation B25 can hold is if a N,1 = 0, providing the solution for that element.

Appendix C: Orthogonal basis generation
To achieve maximum signal-to-noise ratio during optimisation or any other solution method, the basis generated for reconstructing the TM, A in Section II B, should ideally be orthogonal. This basis is constructed starting from an arbitrary vector ∈ C N used for the diagonal elements ( Figure 14). In general, this will produce a complete but non-orthogonal basis. Elements of this basis may be very similar resulting in numerical instabilities during the zeroth-order recovery process.
To produce an orthogonal basis, we first define a function that gives the solution of the system of equations defined in Equation B13. This function, which could be implemented by any linear equation solver, takes as input a vector representing diagonal elements of the solution matrix, u, and returns the full matrix, B: We can then start with an arbitrary set of complex diagonal elements, u 1 ∈ C N , to get a basis element for A (defined in Equation B24), which we term B 1 = f (u 1 ). Next, we invert Equation B24 to find the associated basis element for A, given by B 1 = Q R B 1 Q −1 C . To get the next basis element we then generate a random complex matrix, B 2 ∈ C N ×N and perform a Gram-Schmidt orthogonalisation step: vec(B 2 ) = vec(B 2 ) − proj vec(B 1 ) vec(B 2 ) (C2) where vec(..) represents ordering of matrix elements into a vector. We then use Equation B24 to convert back to an estimated basis element for A : and find the diagonal elements of this basis element: We then use these diagonal elements to generate a new matrix that is a valid solution to B13 (i.e. a basis element for A ): and convert back to a basis element for A: We then substitute B 2 back into Equation C2 and iterate several times. The result of this will be a basis matrix, B 2 , that is orthogonal to B 1 . This process is then repeated for subsequent basis elements using all previously generated basis elements. For example, the At the longest wavelength, λ 0 + ∆λ/2, the phase shift introduced by the shortest path is: and by the longest path: ∆φ 1 − ∆φ 2 = 2π λ 0 + ∆λ/2 (λ 0 + ∆λ/2 − λ 0 + ∆λ/2) + λ 0 − ∆λ/2 (λ 0 − ∆λ/2 − λ 0 − ∆λ/2) (E20) ∆φ 1 − ∆φ 2 = 2π λ 0 + ∆λ/2 ∆λ This then shows that if the wavelength is kept within the spectral bandwidth, the maximum additional phase shift introduced between two paths will be 2π, validating the assumption made when taking the matrix logarithm in Section II C.
Intuitively, if path lengths vary in phase by 2π when the wavelength is varied, the diffracted far-field pattern will remain similar and so the 'speckle correlation' will be high.
When the path length differs by > 2π, variations in wavelengths produce quite different diffraction patterns (as the phases will be significantly altered) and thus the speckle patterns will become decorrelated. This is described in more detail in [29].

Appendix F: Bandwidth of step-index MMF
To validate the first-order recovery algorithm, we use a dataset consisting of TMs taken at multiple wavelengths (1525-1567nm in steps of 0.08nm) [38]. The TMs are taken from a step-index refractive index profile MMF with 420 modes (including polarisation modes We then compute an error metric to test the predictive accuracy of the first-order model: Setting λ ref to 1526nm, we can then plot est as a function of ∆λ ( Figure 15). We find empirically that changing λ ref does not significantly change this curve.
The full 420 × 420 matrix provides est = 0.5 for a ∆λ of ∼0.5nm. This is insufficient to to enable the design of a reflector stack like that in Section III A using off-the-shelf optical filters because the spacings between filter centre wavelengths would need to be < 0.1nm. To improve bandwidth, we restrict analysis to the lowest order modes of the fibre and form an N × N sub-matrix comprising the first N columns of the first N rows of each A MMF . The resulting est curves for N = 30, 72, 110, 132, 156, 210, 420 are plotted in Figure 15. We find that using the 110 lowest-order modes gives a ∆λ of 5nm for est = 0.5. This is ample to allow design of a realistic filter stack in line with the principles of Section III A and III B and so is used here.
In reality dispersion will further change the effective optical thicknesses at the different wavelengths. From the dataset here, we compute that over a 5nm range the error introduced is <3% and so is therefore neglected in this work.