Generalized multi-photon quantum interference

Non-classical interference of photons lies at the heart of optical quantum information processing. This effect is exploited in universal quantum gates as well as in purpose-built quantum computers that solve the BosonSampling problem. Although non-classical interference is often associated with perfectly indistinguishable photons this only represents the degenerate case, hard to achieve under realistic experimental conditions. Here we exploit tunable distinguishability to reveal the full spectrum of multi-photon non-classical interference. This we investigate in theory and experiment by controlling the delay times of three photons injected into an integrated interferometric network. We derive the entire coincidence landscape and identify transition matrix immanants as ideally suited functions to describe the generalized case of input photons with arbitrary distinguishability. We introduce a compact description by utilizing a natural basis which decouples the input state from the interferometric network, thereby providing a useful tool for even larger photon numbers.

photons, through a randomly chosen passive linear network and yields an output distribution relying on non-classical interference [9]. Such samplings cannot be efficiently simulated on conventional computers, due to two reasonable conjectures: the Permanent-of-Gaussians conjecture and the Permanent Anti-Concentration Conjecture. Furthermore the existence of a classical polynomial-time randomized algorithm would imply a major collapse of the computational complexity polynomial hierarchy. The improbability of such a collapse indicates instead that efficient approximate experimental BosonSampling would falsify the extended Church-Turing thesis. This thesis states that problems efficiently solvable by physical devices are also solvable efficiently by a probabilistic, but non-quantum, Turing machine. Consequently efficiently scalable experimental BosonSampling would present strong evidence that quantum computing is superior to classical computing based on probabilistic Turing machines.
The importance of experimentally testing the superiority of quantum computing over classical computing has been the impetus of several impressive photon interferometry experiments using integrated [3][4][5] and fiber-based systems [2].
These experimental benchmarks of optical BosonSampling stimulated discussions [10,11] whether BosonSampling output distributions can be discriminated from uniform distributions in the large photon-number regime. Recent experimental comparisons [12,13] of output distributions originating from classical and quantum multi-photon interference strongly indicate that output distributions from BosonSampling computers are not uniform. However, for more accurate studies the influence of errors, which is still under investigation [6,14,15], needs to be assessed. Aaronson and Arkhipov, in their seminal theoretical introduction of the BosonSampling problem and its potential implementation, presciently identified five "obvious" errors that have to be considered in experimentally realizing BosonSampling. These errors are: imperfect preparation of the n-photon Fock-state, inaccurate description of the interferometer, photon losses, imperfect detectors and non-simultaneity of photon arrival times. Photon losses and imperfect detection can be ameliorated for demonstrations of principle by post-selection techniques, and inaccuracy of the description of the transition matrix can be minimized by stabilisation and process tomography. Consequently imperfect preparation of photons and nonsimultaneity are the most insidious of the five problems and need careful study. State of the art multi-photon sources face intrinsic challenges in terms of their photons' distinguishability and well-defined photon number states, both affecting crucially the performance of Boson-Sampling computers. Photon-number-resolving detectors with near unit detection efficiency [16][17][18] provide the quantum technology for heralding precise photonnumber input states [19], rendering one source of imperfection rather unlikely in the near future. In contrast the achievement of indistinguishable photons is still experimentally demanding by requiring precise control of various parameters such as spectral-, temporal-, spatialand polarization properties [20]. In the classical case of all photons being distinguishable the resulting output distribution of a sampling-computer can always be efficiently calculated on a conventional computer.
In this work we manipulate the distinguishability of arXiv:1403.3433v2 [quant-ph] 28 Mar 2014 single photons by controlling their relative time delays. Here we construct a theoretical framework capable of determining the quality of BosonSampling for arbitrary degrees of distinguishability. Our approach contrasts with all previous experiments that considered only the extreme cases of perfect classical or quantum conditions. We further show that the output distribution depends not only on the permanent -as in the case of an ideal BosonSampling computation. It rather depends on all immanants of the interferometer transition matrix for the specific representations given by the number of the input photons n and the number of modes m.
The simplest instance of BosonSampling is the Hong-Ou-Mandel dip [21], wherein two photons are injected into distinct input ports of a beam splitter, which is effectively an n = m = 2 interferometer. One element of the output probability distribution, the case where the two photons exit the beam splitter in different output ports, is recorded via a coincidence measurement. This coincidence rate depends on the transformation matrix, here defined by the splitting ratio of the beam splitter, and the distinguishability of the photons. In the iconic example of a balanced, i.e. 50/50, beam splitter and indistinguishable photons the coincidence rate vanishes. The established technique to calibrate for the point of maximal non-classical interference relies on tuning the relative temporal delay, i.e. the distinguishability between the two photons. Mathematically the relevant function is an overlap integral, which accounts for the photons' key properties such as spectral shape, polarization, spatial mode and relative temporal delay. Any distinguishability due to causes other than delay can be treated as mode mismatch, which reduces non-classical interference according to this mode-overlap integral. The coincidence rate for an arbitrary beam splitter described by a transformation unitary B corresponds to where 0 ≤ ζ ≤ 1 is derived from the mode-overlap integral and ξ is a factor describing the shape of the interference feature (see supplementary information for further details). The permanent (per) and determinant (det) are special cases of the immanant, imm(M ) = σ χ(σ) i M iσ(i) for M ij matrix elements of M , with χ(σ) the character of permutation σ: χ(σ) = 1 for the permanent and χ(σ) = sgn(σ) for the determinant. For perfect mode matching, P 11 (0) = |per(B)| 2 , which is zero for a balanced beam splitter. Controllable distinguishability can thus be understood as varying the relative weights of |per(B)| 2 and |det(B)| 2 in equation (I). This weighting of permanent and determinant in the coincidence rate generalizes to relative weightings of immanants for more than two modes [22,23].

II. EXPERIMENT
Photonic BosonSampling extends such non-classical interference to more photons n and even larger interfer-ometric networks with m > n channels. We implement such a network for n = 3 and m = 5, and test the effect of distinguishability by controlling relative time delays between the three photons. Coincidences are measured at three of the five output ports as a function of the relative delays ∆τ 1 between the first and second photon and ∆τ 2 between the second and third photon. There exist ten possible ways of injecting three photons into a fivemoded interferometer and similarly ten different ways for the three photons to exit this network, resulting in 100 different combinations of input and output ports. Each combination specifies a 3 × 3 submatrix R and the coincidence rates, given by equation (1), of this scattering event can be visualized as a three-dimensional landscape with the (∆τ 1 , ∆τ 2 ) plane serving as the domain. In the limit ∆τ 1 = ∆τ 2 = 0 and perfect indistinguishability the ten landscapes for one input combination collapse to the ten post-selected probabilities of a BosonSampling output distribution. The coincidence rate of each landscape corresponds to =v † 1 1 + 12 ζ 12 e −ξ12∆τ 2 1 + 23 ζ 23 e −ξ23∆τ 2 2 + 13 ζ 13 e −ξ13(∆τ1−∆τ2) 2 + 123 ζ 123 e −ξ123(∆τ1,∆τ2) + e −ξ * where R denotes a transformation by a 3 × 3 submatrix andâ † 1 (ω),â † 2 (ω ) andâ † 3 (ω ) the creation operators in modes 1, 2, and 3 for photons with different spectral shape functions dependent on the frequency variables ω, ω , ω . An expression of equation (1), expanded in terms of immanants, determinants and permanents, results in a linear superposition of 60 terms. However, choosing a different decomposition allows a compact presentation as shown in equation (2) (see methods and supplementary information for further details).
Here the immanants, permanent and determinant are contained in a six-dimensional vector v. Analogous to equation (I) ζ terms are derived from the mode-overlap integral and ξ terms are factors describing the shape of the interference feature. The subscripts label the input modes of the photons, and 12 , 13 , 23 and 123 are permutation matrices describing the symmetry under exchange of the photons.
Distinct regions in a landscape are related to different physical behavior of the three photons (see Fig. 1). In the center (∆τ 1 = ∆τ 2 = 0), all photons exhibit maximal indistinguishability and the output probability distribution is predominantly proportional to the permanent. Any distinguishability, for example due to polarization or spectral mode-mismatch, leads to contributions from immanants and the determinant, even for perfect temporal overlap. This is reflected in equation (2) by the weighting factors ζ being smaller than one. Note that only perfect indistinguishability causes destructive interference of 54 of the 60 terms such that only the contribution from the permanent will show up in the output distribution. Outside the central region, three troughs or ridges of pairwise temporal indistinguishability (∆τ 1 = ∆τ 2 , ∆τ 1 = 0, and ∆τ 2 = 0) structure the landscape further. These emerge as a result of non-classical interference between two of the three photons. The output probability distribution is now a sum containing the permanent, determinant and some immanants. The weighting of those is dominated by the temporal overlap dependent on ∆τ 1 and ∆τ 2 , whereas the weighting factors ζ are small in comparison. Note that contributions from the determinant appear for partial distinguishability of all interfering photons only, originating for example from modal mismatch. The remaining areas form plateaux, where all photons exhibit complete temporal distinguishability.
Here the temporal overlap functions converge to zero which renders the output probability completely independent of any weighting factors, often referred to as classical behaviour of the photons.

III. CONCLUSION
In Fig. 1 we illustrate the central result of our experiment. We sampled the output probability for six different sets of temporal delays, and therefore distinguishability, for two different combinations of output modes (see supplementary information for detailed information). The predicted output probabilities are calculated from the reconstruction of the unitary matrix describing the interferometer. We find a reduced χ 2 of 1.38 and 1.10 for the two different combinations of output modes. Small deviations from the model can be attributed to higherorder emission, the influence of loss and inaccuracy in the reconstructed description of the interferometric network. Our method allows the influence of photon distinguishability to be quantified and is independent of the source of the distinguishability. It is universally applicable to any linear-optical network processing single-photons and exhibiting non-classical interference. By relating the probability distributions to the transition matrix immanants tighter estimates for the underlying BosonSampling distribution can be achieved. Our method is not only valuable for assessing the quality of a BosonSampling computer but also shows a way to expand intermediate computations to more general sampling problems based on immanants.

IV. METHODS
Three-photon coincidence probability Vector v in equation (2) is defined as: where R ijk is R with rows arranged in order i, j and k.  The intersections of the inputs' columns and the outputs' rows uniquely select matrix-entries that constitute 3 × 3 submatrices. Tuning the temporal delay of the three photons with respect to each other (∆τ1 and ∆τ2) gives rise to coincidence landscapes (b) and e)). Their temporal distinguishability determines the degree of non-classical interference and therefore probability to detect such an event. Six characteristic points (P1 ... P6) of each landscape are experimentally sampled. Theoretical prediction (left bars, shaded) and experimentally obtained output probabilities (right bars) for the six points and both output combinations are shown in c) and f ). The reduced χ 2 is 1.38 and 1.10 respectively and the experimental errors are calculated as standard deviation of the mean.

State generation
power of this second harmonic generation can be controlled by a power regulation stage consisting of a halfwave plate (HWP) and a polarizing beam splitter (PBS) placed before the LBO-crystal. The resulting emission at 394.5 nm is focused into a 2 mm thick β-BaB 2 O 4 (BBO) crystal cut for degenerate non-collinear type-II down-conversion [24]. A compensation scheme consisting of HWPs and 1 mm thick BBO-crystals is applied for countering temporal and spatial walk-off. The two spatial outputs of the down-converter pass through narrowband interference filters (λ FWHM = 3 nm) to achieve a coherence time greater than the birefringent walk-off due to group velocity mismatch in the crystal (|v ge − v go | × half crystal thickness). Additionally this renders the photons close to spectral indistinguishability. The down-conversion-source is aligned to emit the maximally entangled Bell-state |φ + = 1 √ 2 (|HH + |V V ) when pumped at 205 mW cw-equivalent pump power. The state is coupled into single mode fibers (Nufern 780-HP) equipped with pedal-based polarisation controllers to counter any stress-induced rotation of the polarisation inside the fiber.
Each of these spatial modes is then coupled to one input of a PBS while its other input is occupied with a vacuum-state. The outputs pass HWPs and are subsequently coupled to four polarisation maintaining fibers (Nufern PM780-HP). Temporal overlap is controlled by two motorized delay lines that exhibit a bidirectional repeatability of ± 1 µm. Temporal alignment precision is limited by other factors in the setup to approximately ± 5 µm and is therefore within a precision of 2.5 % of the coherence length of the photons. The polarisation main- Figure 2. Experimental setup. Four photons are generated via spontaneous parametric down-conversion (SPDC) and distributed to four spatial modes with two PBSs. A four-fold coincidence event consisting of three photons exiting the network and a trigger event postselects the desired input state. The delay lines allow to tune the distinguishability and therefore quantum-interference of the three photons propagating through the waveguide. The integrated circuit is shown in a Mach-Zehnder-decomposition and consists of eight beam splitters and eleven phase shifters.
taining fibers are mated to a single mode fiber v-groovearray (Nufern PM780-HP) with a pitch of 127 µm and butt-coupled to the integrated circuit. The coupling is controlled by a manual six-axis flexure stage and stable within 5 % of the total single-photon counts over 12 hours. The output fiber array consists of a multimode vgroove-array (GIF-625) and the photons are detected by single-photon avalanche photodiodes which are recorded with a home-built Field Programmable Gate Array logic. The coincidence time window was set to 3 ns. In order to measure the six points of the coincidence landscapes a three-photon input state was injected into the integrated network (see supplementary information for further details). Therefore the BBO was pumped with cw-equivalent power of 700 mW and the ratio of the sixphoton emission over the desired four-photon emission was measured to be below 5 %.
Integrated network fabrication. The integrated photonic networks were fabricated using a femtosecond direct-write writing technology [25,26]. Laser pulses were focused 370 µm below the surface of a high-purity fused silica wafer by an NA = 0.

Two-photon non-classical interference
Two photons injected into different inputs of an arbitrary beam splitter or a network built from arbitrary beam splitters and phase shifters will interfere non-classically [21,27]. This input state can be expressed as, for A † i (α i ) a creation operator for a photon with spectral function centered at time τ i . The frequency-mode creation operators on the right-hand side (RHS) of equation (5) satisfy the commutator relation with 1 the identity operator. For two photons it is sufficient to define their relative temporal delay as ∆τ = τ 1 − τ 2 . Only in the case of ideal bosonic particles exhibiting no modal mismatch and perfect temporal overlap, i.e. ∆τ = 0, does the RHS of equation (7) become the well-known bosonic commutator relation describing perfect symmetry under exchange. When the two-photon input state (see equation (4)) is mixed via a transformation matrix B = U 2×2 and projected on an output where the two photons exit in different modes, the output probability becomes, denoting factors arising from the spectral overlap integral. For the case of indistinguishable photons (ω c,1 = ω c,2 , σ 1 = σ 2 or ∆τ = 0), the output probability is only proportional to the permanent. This is a function symmetric under permutation of rows of the transformation matrix arising in photon interferometry due to bosonic exchange symmetry. However, with loss of complete indistinguishability (ω c1 = ω c2 , σ 1 = σ 2 and ∆τ = 0), equation (9) becomes proportional to a combination of the determinant and the permanent. This is a consequence of the input state losing its symmetry under exchange.
The vector v contains all the immanants and the determinant and permanent of R: The six matrices are, in fact, permutation matrices reduced to block-diagonal form. The change of basis leading to this block-diagonal form also induces a transformation of the original basis to a new basis where the new basis vectors are components of the vector v above.
Equation (16) describes the same features as equation (13); however equation (16) is given in a maximally decoupled basis, which allows for a very compact notation. The terms originating from the overlap integrals (ζ terms and ξ terms) contain all the information on the physical properties of the interfering photons. The interferometric network is described by R and the effect of the permutation symmetry of the photons is included in the vector v and the permutation matrices ρ. In the extremal case when all the photons are indistinguishable, i.e., we have ζ ij = 1, ξ ij = σ 2 /2 and ν ij = ω so the output probability reduces from a superposition of 60 terms to just P 111 → |per(R)| 2 .

Matrix Reconstruction
The fabrication of integrated photonic networks using a femtosecond-laser-direct-writing technology works with high precision and high stability. Discrete unitary operators acting on modes can be realized solely from beam splitters and phase shifters [28]. These networks are arranged like cascaded Mach-Zehnder interferometers shown in Fig. 3. Notably though, even advanced writing precision can introduce small deviations from the initially targeted values of individual elements. In our case this writing precision is limited to around 50 nm over the whole length of the waveguide (in this experiment 10 cm). In a cascaded interferometric arrangement small deviations of individual elements may add up to a noticeable deviation in the overall transformation. The splitting ratio of individual directional couplers is set by their mode separation and coupling length. Both characteristic variables are three orders of magnitude bigger than the positioning precision and therefore unaffected by it. Unfortunately small length fluctuations due to the positioning precision can introduce unintended phase shifts. In the worst case, i.e. a phase shifter spanning the whole length of a waveguide, the resultant phase shifts can even reach π/8. The layout used for the interferometric networks reported here (see Fig. 3) circumvents this worst case. Even if the unintended phase shifts are decreased by a factor of 3 at least; their influence needs to be evaluated and the actually implemented unitary needs to be reconstructed. The characterization procedure we use builds on the one introduced in [4,29]. Two-photon states from a down-conversion source are injected into different modes of the optical network to be characterized. This in situ method allows for a characterization with states having the same physical properties, e.g. frequency and spectral shape, as used later in the experiment.

a. Estimating the visibilities of submatrices
We assume the optical interferometer can be described by a 5 × 5 unitary matrix and we reconstruct its transformation via visibilities measured by injecting two photons into any combination of two of its five inputs. The visibility for two photons entering input modes i, j and exiting in the output modes k, l can be calculated from the 2 × 2 submatrix U i,j,k,l . For five input and output modes this results in 5 2 × 5 2 = 100 possibilities. Owing to the structure of the interferometer (see Fig. 3), a photon injected into port 5 cannot exit from output 1'. This leads to a visibility of zero for the four input pairs ij = 15, 25, 35, 45 and the output pairs kl = 15, 25, 35, 45. These visibilities are omitted from this reconstruction algorithm, so the unitary transformation is reconstructed from 84 non-zero visibilities.
Our interferometric network consists of eight beam splitters and eleven phase shifters. Each beam splitter imple-ments a SU(2) transformation with matrix representation: where β is the Euler angle associated with the transmittivity η via the relationship η = cos 2 (β/2). Note that in equation (21) the beam splitter also implements a relative phase shift of π between the first and second mode. The eleven phase shifters produce additional phases in their respective modes. Each phase shifter has a matrix representation of with α i the phase shift in mode i. The spectral shape of the photons is measured with a single-photon spectrometer (Ocean Optics QE6500) and to a good approximation is of Gaussian shape. Such Gaussians are defined by only two parameters, namely their central frequency and the variance, which for the i th photon of the input pair is given by equation (6), and expressed here as Assuming both photons exhibit identical spectral function, i.e. |φ 1 (ω)| 2 = |φ 2 (ω)| 2 , and the detectors are modeled by the detection positive-operator valued measure (POVM) with two elements {Π 0 , Π 1 } satisfying completeness, then the visibility is with and U a,b i,j,k,l denotes the element in the a th row and b th column of the matrix U i,j,k,l . In an experiment the two photons will always have slightly different spectral functions whose mismatch needs to be accounted for. The central wavelengths and spectral bandwidths of the photons used in this characterization measurement are λ c,1 = 789.05 nm, ∆λ 1 = 2.9 nm, and λ c,2 = 788.60 nm, ∆λ 2 = 2.9 nm respectively. The coincidence counts N c as a function of time delay t and spectral mode mismatch are where Y 0 , A, t c and T are parameters to be fitted to the experimental data. The experimental data for a given input/output combination i, j, k, l it is typically recorded for 30 increments with a stepwidth of 66 fs and integrated over 800 s each step. The coincidences are read out by a field-programmable gate array logic (FPGA logic). As individual delays are set by translating a fiber coupler with a motorized screw (Newport LTA-HL) there can be a small drift in coupling efficiency over the whole delay-range of 2000 fs. Without this drift, the background of the visibility would be a horizontal straight line. For drifts smaller than 5 % of the two-photon flux the drift is in good approximation linear and can be modelled with an additional parameter, T . The positioning precision of the delay lines is limited to approximately ± 5 µm which is within 2.5 % of the coherence time of the interfering photons. When the two-photon input state is generated via down-conversion pumped by a pulsed laser system, higher order emission can lead to unwanted contribution to the input state. The first higher order, which is a four-fold emission, causes a small contribution of two photons in each input mode during the characterization of a 2 × 2 submatrix. This can add a constant background to the two-fold coincidences in the following scenario: two photons in one input mode are lost and the two photons in the other input mode leave the network in different output ports. We measure such contributions by blocking one of the two input-modes and recording the two-photon coincidences at the output. These signals are labelled HO 1 and HO 2 respectively and subtracted from the data. The background coincidence rate d may be interpreted as a contribution to N c stemming from dark counts due to electrical noise and background light. This rate d is also present in HO 1 and HO 2 . Therefore it has to be added to equation (27) to account for all unwanted coincidences only once. The error for the raw data was verified to be Poissonian. For the data processing the error of the higher order term (HO 1 + HO 2 − d) and the abscissa-error caused by the limited alignment precision of the delay lines need to be taken into account additionally. These errors provide weighting in the minimization algorithm and influence the standard errors of the fitted parameters. The visibility, is finally calculated from the parameters Y 0 and A, whereas the width of the dip or peak is fixed by the spectral function of the two photons. Only 84 out of 100 visibilities are non-zero and their value, {V (exp) i , i = 1, . . . , 84} and standard deviation, {σ i , i = 1, . . . , 84} are extracted via the procedure outlined above. The resultant data fit with theory exhibits χ 2 reduced = 1.74 [30]. An example for one of the 84 datasets is shown in Fig. 4.

∆τ[fs]
-  Figure 4. Example for one dataset used for the reconstruction of U5. The best fit of equation (27) to the data-set is shown in blue. Here the visibility is calculated from the best fit parameters Y0 and A. The reduced χ 2 resulting from the fit shown in blue is χ 2 blue = 2.02. Fluctuations in the count rate for values of |∆τ | >500 fs drive the reduced χ 2 away from 1. These fluctuations can be interpreted as the random noise background in the lab and the increased reduced χ 2 is reflecting that. However, the precision of the fitted parameters Y0 and A and ultimately the extracted visibility V is only marginally affected by these fluctuations. The 5 × 5 unitary description of the interferometric network is reconstructed from 84 of these visibilities. The curve in green, N (th) c (t) (see equation (33)) is calculated from four matrix entries of the reconstructed unitary and results in an overlap with the data of χ 2 green = 2.70. The agreement of these two curves and corresponding reduced χ 2 s is a qualitative measure for the precision of the reconstruction.

b. Parameter estimation and reconstruction of the unitary matrix
A unitary transformation of a linear optical network can be reconstructed from single-photon transmission probabilities and two-photon interference visibilities [29]. A technique using coherent states [31] follows a similar approach. Both techniques reconstruct the unitary description in a dephased representation where the single-photon or single-input coherent state data is used to estimate the real parts of the matrix-entries. The imaginary parts of the matrix-entries are reconstructed from the two-photon interference visibilities or directly from the relative phase shifts. When the layout and initially targeted parameters of the building blocks (see Fig. 3) of the interferometric network are known, their actual parameters can be fitted alternatively. Our technique uses an over-complete set of visibilities and the parameters of the interferometer that give an optimal fit to the experimentally measured visibilities are obtained using a least square optimization weighted with the standard errors of the experimental visibilities (see 3 a for details). Eight of the 19 parameters are transmittivities, β 1 , β 2 , . . . β 8 , and eleven are phases, φ 1 , φ 2 , . . . φ 11 . To find the best-fit set of parameters, the data was processed with a Matlab program that uses fmincon to minimize the function V opt , where V (th) i is the theoretical value of the visibility calculated from our special unitary model of the interferometer using equation (25) for the i th data set, and Γ is a constant value equal to (number of data sets in visibilities − number of parameters − 1) = 2522 − 19 − 188 − 1 = 2314. Equation (29) looks similar to a reduced χ 2 but has to be interpreted differently. A value close to 0 is desirable and indicates good agreement between experimentally extracted and theoretically predicted visibilities. In our case the result is V opt = 0.351.
The 5 × 5 reconstructed matrix U 5 using the procedure outlined above is Using this matrix, the probability of coincidence counts, P (th) 11 , can be predicted for any two-photon inputs and outputs. For the inputs i and j, i < j and outputs k and l, k < l, this reads as where and U ab 5 is the element in the a th row and b th column of U 5 . The actual coincidence count is then where N 0 , t c and T are parameters used to find the best fit to the experimental data. The exact χ 2 red is calculated using where m = 3030, ν = m − 20 − 100 − 1 = 2909, i is the error for the corresponding datapoint, and N (exp) c,i denoting the experimental data corrected for higher order emissions. The sum is taken over the data set and the index labels the data. The obtained χ 2 red between the data and the predicted coincidence counts using U 5 is This value should be compared to reduced χ 2 exp = 1.74 obtained by fitting the primary data to extract the 84 visibilities in the beginning (see 3 a). The difference between those two reduced χ 2 s can be attributed to accuracy of the reconstructed unitary matrix U 5 . While fluctuations in the count rate for values of |∆τ | >500 fs drive both reduced χ 2 s away from 1, the difference between the two reduced χ 2 s is relatively small (≈ 0.35). An example for one of the 100 data sets is shown in Fig. 4.

State Generation
We use an 80 MHz Ti:Sapphire oscillator emitting 150 fs pulses at a wavelength of 789 nm which get frequency doubled via a LiB 3 O 5 (LBO). The upconverted beam is focused into a 2 mm thick β − BaB 2 O 2 (BBO) crystal cut for degenerate non-collinear type-II spontaneous parametric down-conversion. To achieve near spectral indistinguishability and enhance temporal coherence of the down-converted wave packets the photons are filterd by λ FWHM =3 nm interference filters. The source is aligned to emit the maximally entangled state when pumped with low pump power (200 mW cw-equivalent). H and V denote horizontal and vertical polarization and a and b are the two spatial emission-modes. When pumped with higher pump powers (700 mW cw-equivalent) noticeable higher order emission occurs: This state is guided to two polarizing beam splitter (PBS) cubes. A detection event in the trigger mode a heralds the generation of either the state |V a |V b |H b or |HH b (see Fig. 5). Only in the first case are the three modes a , b , and b occupied with one single photon, whereas in the latter case mode b is occupied with two photons and mode b with vacuum. Post-selection on a four-fold coincidence between mode a , a , b , and b allows for the heralding of the desired input state where only one photons enters each input mode. The half-wave plates in mode a and b are set to 45 • to render them indistinguishable in polarization from the other photons. This heralding scheme holds independently of any transformation for the photons in mode a , b , and b as long as it acts on spatial modes, e.g. consisting of beam splitters and phase shifters only. Figure 5. State generation. A pump beam is focused into a 2 mm β-BaB2O2 (BBO) crystal cut for non-collinear, degenerate, type-II down-conversion. The generated state is emitted into the spatial modes a and b. A compensation scheme consisting of half-wave plates (HWPs) and 1 mm thick BBO crystals is applied for countering temporal and spatial walk-off. Narrowband interference filters (λFWHM = 3 nm) are applied to increase the temporal coherence of the photons and render them close to spectral indistinguishability. The modes a and b are subsequently split by polarizing beam splitter cubes (PBS) and two halfwave plates in their reflected ports are set to 45 • to ensure the same polarization in all four output modes (a , a , b , and b ). With this scheme three indistinguishable photons in mode a , b , and b each can be heralded from a four-fold emission by a successful trigger event in mode a .

Analysis of the three-fold coincidence data
Three photons are inserted into input modes 1,2 and 4 of the interferometric network. The spectral characteristics of these photons were measured using a single-photon spectrometer (Ocean Optics QE6500) and are in good approximation of Gaussian shape. Note that this spectral data differs slightly compared to the characterization measurements (see 3 a). This spectral data allows to express the mode overlap integrals in dependence of the time delays ∆τ 1 and ∆τ 2 between the first and second photon and the second and third photon respectively. The theoretical prediction for the output probability in any of the ten three-fold output ports is then calculated using equation (16). Consequently each 3 × 3 submatrix R is constituted by matrix elements selected by the input and output ports. The output probability (see equation (16)) of any landscape contains a constant term and four terms proportional to different mode overlap functions. By sampling six points of pairwise temporal delay of ∆τ 1 and ∆τ 2 contributions of each of these terms can be assessed. These six points are ∆τ 1 ∆τ 2 P1 0 fs 130 fs P2 0 fs -870 fs P3 -300 fs -170 fs P4 -1000 fs -870 fs P5 -1000 fs 130 fs P6 -1000 fs 1130 fs An offset of ∆τ off = 130 fs is introduced in the temporal delay mode ∆τ 2 , otherwise the delays are set to combinations of 0 fs, −300 fs and ±1000 fs. Precision of the temporal alignment was estimated to be ±16 fs. In one measurement run the points P1 to P6 are recorded consecutively for two hours each. To account for effects of drift this order is reversed in the next measurement run, therefore the points are recorded in the order P6 to P1. The four-fold count rates range from 1 mHz to 100 mHz dependent on the output combination. In between each measurement run the setup was realigned to optimize for maximal count rates. In order to obtain sufficient statistics, the whole data acquisition is repeated over 19 measurement runs for a total of 228 hours. As Poissonian error modeling results in too optimistic error bars in case of long data acquisition due to multiple sources of error, we adapted the error modeling. The 19 measurements are independent runs therefore mean and standard deviation of the mean provide more useful information. Each individual measurement run is represented as a six-dimensional vector, with the i th entry of the vector containing the four-fold counts of the P i th delay point integrated over two hours. These vectors can then be normalized to unit vectors thereby obtaining relative output probabilities. The mean and the standard deviation of the mean can now be calculated for each of the six delay points. Ultimately the overlap with the theoretical prediction is obtained by a least squared minimization weighted with the standard deviations of the mean. Here a linear scaling factor is introduced relating the relative experimental probabilities to the absolute theoretical ones. The goodness of fit is calculated using the reduced χ 2 . The number of degrees of freedom is in this case ν = 6 − 2 = 4.