Emergence of biased errors in imperfect photonic circuits

We study the impact of experimental imperfections in integrated photonic circuits. We discuss the emergence of a moderate biased error in path encoding, and investigate its correlation with properties of the optical paths. Our analysis connects and deepens previous studies in this direction, revealing potential issues for high-precision tests and optical implementations of machine learning.


I. INTRODUCTION
Photonic integrated technologies are a key resource for classical and quantum applications 1,2 , also including machine learning and artificial intelligence 3,4 . Nevertheless, the unavoidable effect of imperfections due to fabrication, calibration and control can severely undermine their performance 5,6 . With path encoding, for instance, a systematic deviation in the scattering probabilities can lead to a biased error in (un)supervised learning [7][8][9][10][11][12][13] or reinforcement learning 14,15 . While major achievements have recently been reported for photonic circuits [16][17][18] , error correction 19 and model training 10,20,21 , the inevitability of such imperfections motivates us to shed more light on the above issue.
In this work, we will consider two canonical unitary decompositions 22,23 and discuss potential vulnerabilities in realistic settings. In principle, these analyses can be readily extended also to other optical architectures. We reveal and quantify a mild systematic bias in their operation, which originates from a biased error in the matrix elements of the unitary transformation. We also show how this feature connects to previously known results, and consider a graph-based approach to tackle similar analyses on noise spreading in an optical mesh. It is not the aim of this work to propose a unique strategy to mitigate these effects, as done very nicely in related works for similar problems 19,[24][25][26] . We aim to understand these issues and bring them closer to the attention in the field, since practical countermeasures will mostly depend on the actual application and physical constraints.

II. FRAMEWORK
In this section, we will first review related results in the context of circuit optimization and characterization, to highlight the main connections and novel contributions. Then, we will introduce the main quantities and models that will be used in our analyses.

A. Related works
The influence of imperfections in photonic circuits has already been investigated in various works 23,[25][26][27] . Here we deepen these studies with a focus on aspects that, to the best of our knowledge, have so far eluded a detailed investigation. Specifically, we study how a biased error can emerge in a noisy circuit, addressing precise, practical questions that concern future applications.
In realistic settings, it is paramount to improve the operation of imperfect components. Pioneering work in this scope was done by Miller, who proposed solutions for automatic alignment, configuration and compensation 28 . Recent works have focused on various other approaches, including numerical techniques 25,29,30 , port allocation and compilation 31 , and architecture refinements 25,[32][33][34] . Our analysis addresses issues related to error propagation in photonic circuits, so that they can be better addressed by the above strategies.
More closely related to this work, there is a series of studies [24][25][26] on the properties of phases and transmissivities in triangular 22 ( ) and rectangular 23 ( ) meshes. Ref. 24 presents a strategy to dial up Haar-random unitary transformations by sampling transmissivities and phases from independent probability density functions. Besides its use in practical applications, this study clarifies how these parameters are distributed, which may affect their noise resilience in a non-homogeneous way. Similar considerations have been put forward in Ref. 25 , showing that, in larger circuits, the distribution of reflectivities is increasingly skewed towards low values. Given that a non-ideal mesh may not achieve these low values, the overall fidelity of these circuits would be limited. More recently, notions of sensitivities and error tolerances of optical components have been expanded in Ref. 26 , which provides a rich analytic and numerical exploration of these aspects. Our analysis extends the study of unbalanced error tolerance 26 from a different and fine-grained perspective, establishing connections to previous works 24,25 . Whenever applicable, we will make these connections explicit.
The above works focus on theoretical properties of linear optical circuits and error mitigation. However, nonideal transformations have been object of detailed analyses also in the context of Boson Sampling (see Refs. [35][36][37] and references within), to investigate whether imperfect optical devices preserve classical computational hardness. In this case, the figures of merit are usually global properties of the output probability distribution (e.g. a distance from the ideal distribution). Conversely, the present analysis abstracts from Boson Sampling and targets individual components of the distributions. We will return to this point in Sec. IV B and VII A.
In Sec. III, we start by introducing the graph-based approach that underlies many of our discussions. In Sec. IV we discuss the core of this work, namely the emergence of systematic deviations in optical meshes subject to uniformly distributed noise. Sec. V addresses similar considerations for the characterization of reconfigurable circuits. In the Appendix we consolidate the above ideas by proposing more detailed analyses.

B. Universal photonic circuits
Any m-dimensional unitary transformation U can be implemented by an integrated photonic circuit using only meshes of beamsplitters and phase shifters (Fig. 1a). Here we focus on two canonical architectures, the 22 and 23 meshes, although other approaches exist in the classical and quantum domains 1,2,38-40 . These architectures consist of optical meshes of Mach-Zehnder interferometers (MZ), arranged in the regular geometry described by their name. An ideal MZ consists of a pair of 50:50 beamsplitters interleaved with a pair of phase shifters (ψ, θ) (Fig. 1b), whose action on the two input modes is described by the transformation There exist also equivalent formulations 28 where the first phase shifter lays on the output mode or in the lower arm of the MZ. With this notation, when ψ = 0 the MZ is in the cross state (all light is transmitted), while ψ = π gives the bar state (all light is reflected). Experimental imperfections are usually modelled by adding Gaussian noise to the ideal parameters. For beamsplitters, this corresponds to sampling a parameter around the ideal value of the transmission amplitude (1/ √ 2), with a width given by the state of the art.
Any m × m U can be decomposed 22,23 in the product of a diagonal matrix D (with complex unit magnitude elements) and m(m − 1)/2 unitaries U k implementing m-mode Givens rotations on the planes spanned by two adjacent optical modes where the u ij (i, j ∈ [1,2]) are the matrix elements of a MZ (see Eq. 1). In the following sections, for many of our analyses we will also represent optical circuits as graphs (see Fig. 1b,c and the next section).

III. PHOTONIC CIRCUITS AS GRAPHS
A convenient way to investigate the effect of imperfections in photonic circuits is to represent them as directed graphs. For and topologies, such graphs are also acyclic: light enters from a subset of the m input nodes, propagates from left to right and leaves the graph from the m output nodes. This approach, recently discussed in a different context in other works 34,41 , facilitates the study of error propagation. In this section we make this intuition more concrete.
A graph representation for the canonical architectures is shown in Fig. 2a ( ) and 2b,c ( ). This approach inspires a number of questions about symmetry: are there more influential nodes contributing to error spreading? Similar questions have already drawn large attention in areas related to information dissemination, contagious disease transmission and social influence 42 . In these cases, nodes with a high centrality index (see Sec. VII B) are known to facilitate spreading, even though this depends on many factors. Motivated by this connection, we propose to leverage ideas from interaction networks to study errors in optical networks.
A notion of sensitivity has been introduced for MZs that depends on their position in the mesh 34 . This quantity is linked to the shape of the distributions induced by the Haar measure over the physical parameters 24,25 . Here we show how consistent considerations can be made from a different perspective, by considering the importance of each MZ (node n) in the circuit (graph G). We investigate two aspects: centrality measures (discussed in Sec. and architectures can be represented as planar graphs to study error propagation. Here, white nodes represent m = 12 input/output modes, colored nodes represent unit cells (two phase shifters each), while edges represent waveguide connections. Different measures of node centrality can be studied and visualized with this approach (see Sec. VII B). a) topology. Colors represent the value of the flow Φ for each node (see Eq. 4), normalized by the maximum. Lighter nodes have a higher value, as in panels (d,e). b) topology, same color scheme as panel (a). c) Sensitivity measure proposed in Ref. 34 . d,e) Data from panels (b,c) projected onto a bar chart (using the same ordering, layer by layer).
VII B) and properties related to paths in a graph. The latter approach is convenient to study error propagation in waveguide structures and, assuming a homogeneous level of noise over all optical elements, to estimate the amount accumulated in the circuit. If we focus on a node, it is reasonable to expect that its influence increases the more connected it is to the input and output modes. We can even focus on subsets of input/output modes, or associate different weights with different paths. For instance, we can expect a node to be more influential if paths are on average longer, since losses and noise increase with the number of elements. To make it more quantitative, we introduce the following heuristic figure of merit, which we simply call f low, I n (O n ) is the set of input (output) nodes in G reaching (reachable by) node n, {π a,b } is the set of all paths π a,b from node a to b, and |.| is their length, for some k > 0. For k = 1, the flow is the geometric average of the mean lengths of all paths connecting input and output to node n. The intuition is that the higher Φ G (n), the more n is likely to be critical in a noisy setting. Results for k = 1 are shown in Fig. 2 (k = 0.5 or k = 2 behave qualitatively similarly). We find that Φ G (n) is consistent with the sensitivity index α G (n) = |I n | + |O n | + m − 1 introduced in Ref. 34 , since they correlate well for both and (Fig. 2b,c) and since, for a given m, they both depend only on I n and O n . The main difference is that, while α G (n) has a solid theoretical support that is lacking in the present discussion, Φ G (n) can in principle be applied to circuits with arbitrary topologies. Also, while α G (n) is an effective and compact tool to study Haarrandom unitaries, Φ G (n) is in principle compatible with applications where unitaries are not drawn according to this measure. It would be exciting to design an estimator that generalizes to arbitrary topologies and that is more grounded in the theory of interferometric design.
The above analysis focused on individual nodes n inside the circuit. When n is an output node, the analysis concerns the connection between optical paths and the matrix elements u ij that describe a scattering process. Specifically, we can count the number of paths π i,j that connect any pair (i, j) of input and output modes. This information allows to reveal the imbalance in the number of tunable elements that influence all u ij , within the same topology and for different topologies (see also Sec. IV B). We find that this number is symmetric about the main diagonal (i = j), due to the vertical symmetry of the mesh, while is almost symmetric about both. As expected, we also find that u ij in present a strong imbalance w.r.t. paths, while the architecture is significantly more balanced. Our aim is then to quantify this intuitive discrepancy, which is one of the reasons why is usually considered superior. However, the relative ease of a brute-force approach (convenient to explore arbitrary, small circuits up to m ∼ 30) comes at the expense of a computationally intensive evaluation, given the ex-ponentially large number of paths. Specific tricks (e.g. leveraging symmetries or propagation in light cones) can only mildly extend this range. Hence, we looked for a more efficient way to estimate the number of paths that contribute to the scattering amplitude u ij . We find that an exact, analytical solution can be obtained using a special integer sequence given by the Catalan's trapezoid 43 . We discuss its derivation in Sec. VII A, with further considerations on the asymptotics. We numerically verify its prediction by comparing it to the graph-based approach, up to the largest size that could be probed on a workstation. Fig. 3 shows an example for m = 50 (no new pattern emerged for larger m).
In general, the formula presented in Sec. VII A -and estimators analogous to Eq. 4 -could be used to quantify the importance of a node for a transition, as well as the asymmetry between transitions. Conversely, the graphbased approach is flexible, can inspire the design of new estimators (see Fig. 4) and supports arbitrary topologies, for which a closed-form expression is not available. Together, these considerations set the stage to discuss how biased errors emerge in single-and multi-photon dynamics in linear optical circuits.

IV. UNIFORM NOISE AND NONUNIFORM DEVIATIONS IN A SCATTERING PROCESS
Given the difference between and meshes shown in Fig. 3 and 4, we ask whether this is reflected in the noise sensitivity of the elements u ij of a unitary implemented with these schemes. This analysis aims to reveal a potential bias that has eluded studies based on coarse-grained measures, such as the total variation distance or the fidelity. Indeed, while these measures provide a simple and useful summary of the overall quality of an implementation, they may not reflect the needs of a practical application. For instance, we do not want a classifier with a different accuracy for different objects (or we would like to compensate for this asymmetry). Also, we expect that high-precision applications, e.g. tests in quantum foundations 44 , would benefit from a fine-grained understanding of the noise resilience of the scattering matrix.

A. Single-photon evolution
To investigate this aspect, we quantify the average deviation of the noisy elementũ ij when Gaussian noise is added in the architecture (i.e. not directly to u ij ). The research question we address here is whether the sensitivity of u ij correlates with the patterns found in Fig. 3  The noise sensitivity of uij reflects patterns in the input-output connectivity (see Fig. 3 and 4), but more rigorous analyses are necessary for a full understanding of this correlation. The overall patterns given by ζij(U ) in Eq. 7 depend on how parameters are sampled (top row: meshes; bottom row: meshes). For instance, patterns for Haar-random unitaries are due to the distribution of phases in the circuit 25 . We study this sensitivity by numerically generating 2 × 10 4 50-mode unitaries U , for 2 × 10 3 random initializations of imperfect components each. Tests are carried out using uniformly sampled phases (a, d), Haar-random unitaries (b, e), and the Fourier transform (c, f). Imperfections are modelled with a mild Gaussian noise on the amplitude transmission of the 50:50 beamsplitters (µ = 2 −1/2 , σ = 10 −2 ) and on the phase shifters (σ = 10 −3 ). In all panels, brighter colors correspond to elements that deviate the most from the ideal value. and 4. A positive answer would suggest an explanation for this phenomenon, which is the first step to counteract it. To this end, we consider the following quantity which is nothing more than an adaptation of the interferometric visibility, for some suitable complex function f . Choosing f 1 (z) = Re(z) and f 2 (x) = Im(z) and considering several random imperfect implementations for each U , we can compute the estimator where γ 1,2 normalize the two terms by their maximum, for both architectures and different unitaries. We consider two ways to sample U : according to the Haar measure (uniform over the unitaries, not in the phase space 25 ) and by uniformly sampling the tunable phases in the circuit (not uniform over the unitaries). Results reveal a clear imbalance in the sensitivity to a homogeneous Gaussian noise among the matrix elements of U (see Fig. 5).
In the case of uniformly sampled phases (Fig. 5a,d), we find that the sensitivity of u ij (vaguely) negatively correlates with the number of optical paths: the higher the connectivity is (see Fig. 3 and 4), the lower the deviation tends to be as measured by ζ ij . However, the pattern that emerges in Fig. 5a suggests that a thorough explanation depends also on other factors. From a practical standpoint, this imbalance prompts us to take it into account when unitaries are not sampled from the Haar measure, e.g. when training machine learning models.
In the case of Haar-random unitaries (Fig. 5b,e), a positive correlation with properties of the optical paths seems to apply (almost opposite to the previous case). We attribute this pattern to the different distribution of phases in the circuit 25 , which was already discussed in Fig. 2. In particular, we see that the u ij that exhibit a large average deviation from the ideal value are correlated with those with a large number of paths in the mesh. However, here too patterns are more complex and call for more refined, formal analyses to retrace the actual cause. Specifically, it would be interesting to understand why a slightly brighter region appears in the bottomright corner of the image for (or a darker region in the center for ).
The above discussion, which involves averages over sev-eral unitaries and random initializations, does not say much about specific transformations. Hence, we applied the same analysis to one illustrative example, the Fourier transform, to see whether this is in any way reflected in a single unitary (Fig. 5c,f). We observe that, despite its symmetry, the sensitivity has the same pattern as the average Haar-random unitary (Fig. 5b,e).
Overall, this shows that the fidelity is not always a suitable metric for high-precision, path-encoded applications. Interestingly, the above patterns survive if we separately plot the two terms in Eq. 7. This means that, for example, multi-photon dynamics would handle noise in a different way, with a marked discrepancy between the most and the least resilient states. This observation will be the object of the next sections.

B. Multi-photon evolution
So far, we discussed how noise affects the matrix elements of U , which correspond to single-photon scattering amplitudes. Here we sharpen this analysis by studying how Gaussian noise, applied uniformly on the tunable phases, influences multi-photon transitions. We emphasize that this research question is different from previous analyses in the literature, which looked at global properties of the noisy probability distribution. While this issue may be not too relevant for Boson Sampling, it can be relevant for Scattershot-like applications 45 and for pathencoded implementations of machine learning 7-9,12-15 , where bias plays a major role. One possible (optimistic) answer we considered was that noise would be almost uniformly distributed on the output, with deviations mildly modulated by properties of U . The complementary hypothesis (which turned out to be the case) was that systematic deviations would be found, dependent on U or on the specific transition. In the following we will describe the analysis on a more technical level.
Let us first introduce some notation. We consider the exponentially large set of input ({v}) and output ({w}) N -photon states (fully indistinguishable photons, without collisions) in an m-mode interferometer, and denote it with Ω m N . We can associate with each pair (v,w) a transition probability p vw , given by the permanent of a submatrix of U defined by the Fock representation of v and w. Looking at the parametric representation of U , we have a set of parameters {ξ} = {ψ} ∪ {θ} that describes the action of all phase shifters. For each pair (v,w), we can also retrieve the subset {ξ} vw that influences p vw . Now, we want to understand whether the signed deviation of a noisyp vw is on average practically null, or whether there is a systematic bias and, in this case, to what we can attribute it. To this end, we define such deviation as δ (r) vw :=p (r) vw − p vw , where r enumerates random noisy initializations of U , and compute the average δ vw ∝ r δ (r) vw ∝ p vw −p vw with standard deviation σ vw . We then evaluate for all parameters ξ k the quantity where |{.}| is the number of elements in the set, k = 1, 2, . . . , m(m − 1) and we used the Iverson bracket Incidentally, δ vw /σ vw is the inverse of the coefficient of variation of the single deviations that produce δ vw . The intuition behind this quantity, and more generally about ∆ ξ k , is that we are interested in the significance of the deviation δ vw from 0 (i.e. how many σ vw is δ vw far from 0). If there is a relevant systematic bias in the probability distribution, we should be able to observe a preferred (positive or negative) direction for some p vw when we consider all combinations. This real-valued measure of significance is then assigned to each parameter ξ k that contributed to its magnitude via the Iverson bracket, in proportion to the number of parameters that were involved in that process (|{ξ} vw | −1 ). We are now ready to outline how the analysis proceeds.
The analysis consists of three stages. (i) [Adding noise] We pick a Haar-random U and uniformly apply Gaussian noise on all tunable phases (i.e. not on the u ij ). The analysis was repeated for several U and different sizes m, finding a consistent behaviour. (ii) [Fromũ ij to dynamics] We numerically generate noisy N -photon probability distributions (N = 2, 3, 4) and compute ∆ ξ k using Eq. 8 (see Fig. 6a). (iii) [From dynamics back toũ ij ] We investigate a possible connection between statistically significant deviations in the probability distribution, tunable phases and u ij (Fig. 6b,c).
An example for this analysis is shown in Fig. 6a for N = 2, m = 12. For a fixed input state v , we find no obvious correlation between the output states and the magnitude of the bias δ v w σ −1 v w : sometimes the bias is large when p v w is large, sometimes the converse is true. However, when we consider all input-output combinations as in Eq. 8 and ask how bias manifests itself in U , we retrieve patterns similar to those in Fig. 3 and 4this time from yet another mechanism. To see this, in Fig. 6c we plot the sum of ∆ ξ k over all parameters ξ k that influence u ij , namely This connection seems to suggest that a biased error in multi-photon dynamics is correlated with properties of the paths. However, we observe that the pattern in Fig.  6c matches the distribution of the number of tunable parameters that influence each u ij (Fig. 6d), providing stronger insights into the underlying error propagation. We emphasize that here we are not simply recombining in u ij noise that was added to each ξ k that influences u ij : in that case, the outcome would be trivial. Instead,  Fig. 5. b) ∆ ξ k for all nodes in the and meshes. This pattern is partly due to the fact that cells closer to the output are reached by more input states. c) ∆ij normalized to the maximum, for all matrix elements uij. We recover patterns similar to Fig. 3

and 4 for both
(left) and (right) meshes. d) Number of tunable parameters that influence each uij, normalized to the maximum (see also Fig. 9). This pattern matches the one in panel (c).
we are first mixing the effect of noise in a quantum process (photons do not evolve independently in the circuit), then we are recombining the (inverse of the) coefficient of variation, which is related to a biased error.

V. BIAS IN THE CALIBRATION PROCESS
Understanding how a circuit operates is a necessary step for practical applications. With tunable circuits, it is especially necessary to calibrate the voltage-to-phase relationship for each thermo-optic phase shifter in the device. In this manuscript, we will use θ ∼ θ 0 + α V 2 and ψ ∼ ψ 0 + β V 2 for phase shifters inside and before the MZ, respectively (see Fig. 1b). During a calibration, the goal is then to estimate (θ 0 , α, ψ 0 , β) for each pair of phase shifters in each MZ. While estimating the transmissivity of directional couplers is also important, here we consider them ideal to isolate the contribution related to the phase.
Bayesian estimation and machine learning are promising approaches to calibrate a device without detailed information on the internal structure (see e.g. Refs. 46,47 and references within). In the case of and meshes, instead, calibration can be carried out by directing light to each MZ in a specific order. A strategy to calibrate the mesh using knowledge about the structure was briefly sketched in Ref. [17][18][19]48,49 . We propose an algorithm in Sec. VII C, which we use for our discussion on errors (see Fig. 7). In this scheme, parameters are estimated in two main steps: (i) all MZs are probed in sequence (Fig. 7a), returning an estimate for (θ 0 , α); (ii) using  stages (a,b). a) Light is injected in first input mode and measured in the last output, to characterize the diagonal A. This step is repeated sequentially for the other diagonals. b) Phase shifters outside the MZs are calibrated by constructing larger MZs consisting of two MZs in the 50:50 state (fuchsia) interleaved with and followed by MZs in the bar state (blue). c) The order chosen by the calibration induces a pattern in the noise distribution. In this example, colors describe the mean deviation of the estimated ψ0 (left) and β (right) from the actual values (bright: high; dark: low), averaged over 10 3 calibrations of Haar-random unitaries.
this information, all MZs are adjusted to either the bar or the 50:50 state (Fig. 7b) to estimate (ψ 0 , β). After this step, additional tricks can be used to improve the estimates. Our main observation, here, is that there is a systematic bias in the uncertainty and accuracy associated with every parameter. This bias is mainly induced by the sequential nature of the calibration, since estimating the latter parameters requires (imperfect) knowledge of several others. In Fig. 7c, for instance, we show how the average error associated to each MZ correlates with its position in the optical mesh, which in turn influences the sequential procedure of the calibration. This issue can become relevant in large-size circuits.

VI. DISCUSSION
Classical and quantum optical technologies provide a promising platform to enable faster and energy-efficient machine learning. Among these technologies, integrated circuits represent a well-established resource with an intuitive operation, good control and advanced miniaturization. In this context, while path-encoding is generally a convenient approach to process information, we argue that it is also necessary to counteract the path-dependent noise hidden in the process. This issue manifests itself in an unbalanced control, both in accuracy and precision, over the map between input and output states, which is potentially relevant in tasks such as classification, reinforcement learning or tests on the foundations of physics.
In this work, we shed light on this aspect by considering single-photon and multi-photon evolutions in triangular and rectangular architectures. We numerically simulate these dynamics in realistic settings, collecting evidence that the impact of noise correlates with properties of the optical paths, as well as with the number of tunable elements they involve. We then present a conjecture for an exact, closed-form expression to compute the number of paths for all input-output combinations and arbitrary size. The validity of this conjecture was verified up to a large circuit size using a graph representation, which we consider of independent interest for several reasons. In fact, we envision the development of techniques to estimate the relevance of nodes in an optical mesh, as done in other fields such as social network analysis. We make a first step also in this direction, reporting on preliminary investigations to identify the most influential regions for error propagation. While some of these tests are inconclusive, we believe that more can be understood by looking carefully with these techniques. This can be especially true for large circuits with arbitrary topology, where special sets of nodes can act as bottlenecks or take on a more influential role for a given application.
Code for all the analyses presented in this manuscript is publicly available online. We conclude with extra remarks on the applicability of this work (see Table I). On the one hand, the positive side is that the imbalance revealed by these analyses seems to be of small magnitude. This means that the above concerns are practically not relevant for most applications today. We leave an absolute estimate of its impact to more specialized studies, which focus on individual technologies and implementations. On the other hand, as the circuits size and the desired accuracy increase, it is reasonable to expect that these concerns will become more relevant in the future. In the highly plausible scenario where optical technologies become a go-to platform for machine learning, similar analyses would also be useful to build trust with the end consumers and funding agencies. We hope this research will help make progress also in this sense.  . It reflects only the author's view, the EU Agency is not responsible for any use that may be made of the information it contains. The project has received partial support from the Austrian Science Fund (FWF) through the project SFB BeyondC F71.

VII. APPENDIX
In the following sections, we study three aspects that were only briefly mentioned in the main text. Sec. VII A presents an exact and closed-form expression to compute the number of optical paths inside the canonical meshes. This information should facilitate the development of measures to counteract the influence of biased errors. Sec. VII B presents considerations on a strategy to estimate the influence of a node on error propagation. Finally, Sec. VII C completes the discussion started in Sec. V, presenting a pseudocode to calibrate the mesh. This can be useful in practical applications, as well as to understand how a bias originates during the process.

A. Counting optical paths
Here we present analytical expressions to compute the exact number of paths that light can take from mode i to j in an m-mode and optical meshes. We denote this set of paths as {π m ij } and its cardinality as |{π m ij }|. This information is relevant to estimate the susceptibility of each input-output transition to circuit imperfections. We also present analytical expressions for the exact number of paths that connect one input MZ (i.e. a MZ in the first vertical layer) to any MZ inside the architecture. This quantity can be used in conjunction with the flow of a MZ (see Eq. 4) to assess its criticality in a large mesh. Further work is necessary to connect this result to applications with specific requirements.

Paths from input to output
For a fixed mesh, we expect |{π m ij }| to depend on the horizontal layers to which i and j belong (modes in the same input/output MZ have the same number of paths) and on the parity of m, since has horizontal symmetry when m is odd. Below we present an analytic expression that indeed fulfills these observations. Given its formulation, we approach the problem using combinatorics. We find by inspection that a solution can be expressed using Catalan's trapezoids 43 C s (a, b), a generalization of the Catalan's triangle (in turn generalizing the Catalan numbers, defined using binomial coefficients). A closed-form expression for C s (a, b) is 1, 2, 3, . . . , a = 0, 1, 2, . . . and b = 0, 1, 2, . . . . We find that, by introducing the compact notation σ m ij := indeed provides the answer to our counting problem Recalling that Eq. 11 is a piecewise function over three regions, we observe that the last condition is always satisfied for all values given by Eq. 13, due to the constraints of the problem (1 ≤ i, j ≤ m). From a physical perspective this makes sense, since |{π m ij }| cannot be 0. We do not know why Eq. 14 is true, but we numerically verified its correctness for all i, j ∈ [1, m] and m ∈ [1,27]. We leave the pleasure of the proof to the curious and more By a similar counting argument, choosing we also find the expression for the mesh Notice that, due to the symmetry rule for binomial coefficients, Eq. 16 is symmetric in i and j, as expected.
We emphasize that |{π m ij }| and |{π m ij }| provide the exact number of paths between modes (i, j). Using these quantities, we can quantify the asymmetry between any single-photon transitions. We can also combine this information to study multi-photon processes with distinguishable photons (since they evolve independently), while correlations may be found in dynamics that involve quantum interference. Indeed, we recall that each u ij depends only on a subset of tunable parameters, related to the waveguide connectivity in the circuit (see Fig. 9).
We can also use Eq. 14 and 16 to quantify the asymmetry between architectures w.r.t the number of paths that connect two modes. This information can be useful since |{π m ij }| correlates with the resilience to noise, which can be very different in the two cases. For instance, for m large and i = j = 1, using the Stirling approximation for the binomial coefficient we obtain the scaling which shows an exponential separation between the two architectures (see also Fig. 3). Similarly, when i = 1 and j = m 2 we can use which works when both a and b are large, and Γ( with w = log 3/8 + 5/4 log 3 ∼ 0.39. We tested the approximations in Eq. 17 and 19 in the regime 30 < m < 300, finding a very good agreement with the exact value given by Eq. 14 and 16. Together, they provide quantitative understanding of the asymmetry between the two architectures. This knowledge can guide the choice of the most appropriate platform for a given application that needs to fulfill specific physical constraints, as well as to correct for this inherent imbalance.

Paths from a Mach-Zehnder to input and output
Here we discuss how to compute the number of paths that connect any MZ in the architecture (located in layer l, q-th position from top) with the input and the output layers of the circuit. In the representation based on a directed graph, we would count in how many ways we can reach a node from the input (|{π l,m q }| IN ), as well as in how many ways n can reach the output (|{π l,m q }| OU T ). Intuitively, this corresponds to the number of paths that lay in the light cone of a MZ (summarized by I n and O n in Sec. III). This information can help evaluate to what extent a MZ affects scattershotlike applications 45 , where light is injected and measured in all input and output modes. We will return to this point in Sec. VII B.
We tackle this task by considering the number of paths (|{π l,m q q }|) that connect a MZ in the first layer (q -th from top) with any MZ in the circuit (l ≥ 1, q-th from top). Notice that each MZ in the first layer can be reached by two input modes, so a factor of 2 may be introduced to describe the problem in terms of modes. In both cases, we consider again the Catalan's trapezoid in Eq. 11 and look for a combination (S l,m q q , A l,m q q , B l,m q q ) G that fits the problem. Here, the superscript G serves to remember that we are describing the problem in terms of nodes in a graph. After some experimentation, we find by inspection that the following expressions solve the problem for circuits with odd size m, i.e.
Similar considerations lead to an analogous of Eq. 21 for even m and for the architecture. Indeed, by using Eq. 11 it is not hard to see that for l = 1 we have |{π 1,m (no path connects two MZs in the same layer). In addition, when l = m = 2r + 1 (r = 0, 1, 2, . . . ) we retrieve the same expression in Eq. 14, as desired. As mentioned, the only difference is that Eq. 14 refers to input-output modes while here we refer to MZs. To match Eq. 14 and Eq. 20 it is sufficient to observe that q = i+1 which provides a baseline to study the centrality of a node. This task will be the focus of the next section.

B. Centrality measures in an optical mesh
In this section, we turn our attention to the impact of individual MZs in realistic implementations. The objective is to identify the most important nodes within the optical meshes, according to a suitable real-valued measure of importance. To this end, motivated by the analogy between light propagation in a noisy circuit (where each MZ can introduce noise into the evolution) and information spreading, here we will consider tools from graph theory and network analysis, where this notion of importance is generally called centrality. The research question we ask is whether network analysis can help understand what noisy MZs are more likely to influence the performance of a circuit, without any information on the underlying quantum process. In other words, is it possible to infer the influence of a MZ by just looking at the network topology?
Centrality measures rank nodes according to their flow (for different types of flow) or to their involvement in the cohesiveness of the network. They are only good within the application they were designed for, and they are likely to disagree otherwise. These rankings provide the order of the most important nodes (although not the relative importance), while little can be said for the majority of the others. This limitation has led to the development of measures to assess the influence of all nodes in a network, e.g. the accessibility (how accessible the network is from a node using random walks) and the expected force of infection. This metric is more related to the flow discussed in Sec. VII A and in Eq. 4. Another aspect of centrality measures is that they operate from a node-centric perspective, i.e. they don't consider the synergy between neighbor nodes (which plays a major role in disease and information spreading). In this case, game-theoretic approaches are usually more effective. Centrality and node influence can be correlated but they are in general distinct.
Given these considerations, we computed various measures of centrality for each MZ 50 : closeness centrality, betweenness −, degree −, eigenvector −, Katz −, PageRank −, HITS −, radiality −, status −, edge betweenness −, and LinkRank −. Results for some of them -those with the most interesting behaviour -are shown in Fig.  10. Contrary to our expectation, we did not find a strong correlation with the reported results.
We also carried out additional numerical simulations to single out the most influential MZs, by adding realistic levels of Gaussian noise on each phase shifter one by one (assuming all the other elements are ideal or nearly ideal). In all these cases, a Monte Carlo simulation using the Fidelity as figure of merit did not reveal any consistent pattern of resilience: the reduction in the Fidelity was too small, comparable to the average fluctuation. We also simulated an imperfect implementation where the error on each phase was proportional to the magnitude: in this case, we indeed recovered the expected pattern due to the phase distribution for Haar-random unitaries 25 . We believe that more informative analyses should be tailored to specific architectures, to circumvent the bias induced by the Haar measure.

C. Calibration
In this section we outline an algorithm to calibrate meshes. While the idea was already sketched in previous works [17][18][19]48,49 , to the best of our knowledge no detailed procedure was proposed that is amenable to immediate reuse and improvement. Understanding this operation is also useful to evaluate the results in Fig. 7, and gauge how different approaches may induce different errors. In principle, calibration could also be seen as an optimization problem, where one looks for the set of parameters that best reproduces all measurements, according to a suitable figure of merit. To get a rough understanding of the feasibility of this approach, we attempted a calibration with a simple, unpretentious least-squares optimization (LSO). Specifically, we (i) considered a circuit of increasing size m, (ii) simulated single-photon (P 1 ) and two-photon (P 2 ) output probability distributions for a set of random parameters {ξ}, and (iii) compared P 1 , P 2 with some P 1 and P 2 obtained by adding Gaussian noise on both transmissivities and phases. Before this preliminary test, our guess was that, for sufficiently small m and low noise, a LSO would return a good estimate of the noiseless parameters. Contrary to our expectation, we found that, even though LSO does a very good job in minimizing the residuals, it returns parameters that can be quite different from the original ones. In addition, both the quality of the result and the required evaluation time become rapidly unfavorable for circuit sizes as low as m ∼ 7 (the number of parameters scales quadratically in m). While we emphasize that one cannot draw any conclusive consideration from this simple test, we gain evidence that (a) specialized strategies are likely to perform significantly better, and (b) simple LSO-like calibrations can fail due to the large number of parameter combinations that approximate the desired output. Importantly, (c) using the output of a LSO to study specific input-output scattering processes may result in large errors, since {ξ} was estimated from a global optimization problem.
The proposed approach consists of two parts (see Algorithm 1). (i) The first two loops serve to characterize the phase shifters in each MZ, by sequentially focusing on the elements below and above the main diagonal of the circuit, respectively. Specifically, each iteration selects a path in the circuit and calls one subroutine (Algorithm 2) that characterizes the phase shifters inside the MZs by LSO. This step reveals what voltages correspond to the bar (V − ; MZ − ), cross (V × ; MZ × ) and balanced (V 50 ) states, which allow to guide light to specific regions of the circuit. Some of these steps can commute and can be parallelized for a faster execution on large circuits. (ii) The last two loops in Algorithm 1 provide information on the phase shifter before each MZ. First, the linear dependency on the applied voltage is characterized (yielding the β coefficients). Once these are known, we can remove the voltage and focus on the offsets ψ 0 , which completes the calibration.