Photonic quantum walks with four-dimensional coins

The dimensionality of the internal coin space of discrete-time quantum walks has a strong impact on the complexity and richness of the dynamics of quantum walkers. While two-dimensional coin operators are sufficient to define a certain range of dynamics on complex graphs, higher dimensional coins are necessary to unleash the full potential of discrete-time quantum walks. In this work we present an experimental realization of a discrete-time quantum walk on a line graph that, instead of two-dimensional, exhibits a four-dimensional coin space. Making use of the extra degree of freedom we observe multiple ballistic propagation speeds specific to higher dimensional coin operators. By implementing a scalable technique, we demonstrate quantum walks on circles of various sizes, as well as on an example of a Husimi cactus graph. The quantum walks are realized via time-multiplexing in a Michelson interferometer loop architecture, employing as the coin degrees of freedom the polarization and the traveling direction of the pulses in the loop. Our theoretical analysis shows that the platform supports implementations of quantum walks with arbitrary $4 \times 4$ unitary coin operations, and usual quantum walks on a line with various periodic and twisted boundary conditions.

While the initial definition of DTQWs assumed translation invariant and time independent dynamics, more versatility can be obtained by spatial and temporal control of the quantum walk parameters. By varying the coin operation such systems have been used experimentally to observe Anderson localization [21,23,33], dynamical localization [26], topological phases [34-42], and other fundamental effects such as recurrence [43] and revivals [44]. The dynamic control of the coin operation can be extended to engineering the topology of the graph on which the walk takes place: finite [31] and percolation graphs [45], and lines with periodic boundary conditions [46] have been demonstrated experimentally.
* sonja.barkhofen@uni-paderborn.de
To have any effect on the walker dynamics, the minimum required dimensionality for the coin space is two. In order to reduce the required theoretical and experimental effort associated with the study of higher dimensional coins, many of the above works employ multi-step protocols, which use only two-dimensional coins. These protocols simulate higher dimensional coins by splitting up each step into multiple coin and shift operations acting on a two-dimensional coin space. They have found use not only in realizing dynamics on graphs embedded in higher dimensions, but also in 1D quantum walks on more sophisticated graphs, such as on percolation graphs or circles [45,46]. However, as the required doubling or even triplication of the necessary step numbers for the implementation of such multi-step schemes is experimentally disadvantageous in terms of losses, inaccuracies, and scalability, these protocols significantly impact the efficiency of the physical implementation.
Already on the one dimensional (1D) line DTQWs with higher dimensional coins have been shown to exhibit unique features not possessed by two-dimensional coins, among the most striking the so-called trapping [47][48][49]. While due to the simplicity of the 1D structure these may be regarded as toy systems, they can be efficiently used to demonstrate several fundamental differences between classical and quantum walks. Trapping can be used for instance in conjunction with dynamically controlled coin operators to shape the profile of the walker's wave packet, having no counterpart in classical random walks. In the case of more complex graph topologies (e.g. graphs embedded in higher dimensional spaces, or nonregular graphs) the dimensionality of the coin space will provide the critical ingredient for more involved or even unexpected applications. For example, DTQWs with genuine four-dimensional coins on structures embedded in the two-dimensional (2D) space admit phases analogous to the quantum spin Hall (QSH) phases [50,51], offering significantly new applications over the phases accessible in 1D [50,52]. Another example is that of the Grover walk on a 2D grid, exhibiting dynamics composed of a spreading and a localized part [53,54], of which only the spreading part can be reproduced by two-dimensional coins [55]. These limitations of two-dimensional coins provide a strong motivation to achieve efficient implementations of quantum walks with genuine higher dimensional coin operators while maintaining precise dynamic control. While there have been theoretical proposals [56,57], and limited experimental realizations of higher dimensional coins [19,23,25,58], no universal scalable platform has been demonstrated yet.
In this work we present experimental implementations of DTQWs on a line governed by programmably controlled four-dimensional coins, reaching beyond the previous two-dimensional definition and demonstrating QWs on new complex graph topologies. At the heart of our time-multiplexing scheme is an interferometer arranged in a Michelson-type geometry, in contrast to earlier implementations based on a Mach-Zehnder geometry. While offering identical stability and versatility, the present setup introduces a new degree of freedom for the coin, namely the direction of propagation of two counterpropagating optical modes. Combining these with polarization supports the four-dimensional coin. The higher-dimensional coin space and the temporal control of the coin operations enable us to efficiently realize DTQWs on non-trivial graphs of different sizes and topologies.
The structure of the paper is as follows. In Sec. II we introduce the experimental apparatus we use for realizing quantum walks with four-dimensional coins, point out its differences to the earlier time-multiplexing setup and detail the principle of operation. Sec. III formalizes the DTQW time evolution of the Michelson geometry and analyzes the attainable dynamics. In Sec. IV we present experimental results on the realization of quantum walks on various graph structures. First, we demonstrate the usual Hadamard quantum walks by restricting the dynamics into an invariant two-dimensional subspace with a suitable choice of the four-dimensional coin operator. Next, we realize a quantum walk with a genuine four-dimensional coin and observe the emergence of multiple lobes in the probability distribution, characteristic of translationally invariant DTQWs with higher dimensional coins. By dynamically controlling the coin, we extend 1D DTQWs with Hadamard and non-mixing coins onto circles of various sizes, and demonstrate effects of periodic boundary conditions, in particular the equidistribution of the quantum walker. Finally, we present a quantum walk on a minimal example of a Husimi cactus graph consisting of a pair of connected circles, resembling a figure eight, with the connecting node characterized by a four-dimensional operator. We discuss the significance of the results and provide an outlook in Sec. V.

II. EXPERIMENTAL APPARATUS
The layout of our experiment, depicted in Fig. 1, resembles a Michelson interferometer closed by a loop. The coherent laser pulse (wavelength 1550 nm) plays the role of the quantum walker, using the mathematical equivalence between wave dynamics and single particle quantum dynamics [59]. The input pulse is coupled into the loop by a beam splitter with low reflectivity R ≈ 1 %, ensuring high transmittivity for the traveling pulses. The loop allows two propagation directions, clockwise (c) and counter-clockwise (cc), and for each direction we can distinguish two orthogonal polarizations, horizontal (H) and vertical (V) (cf. Fig. 2b). We label these four orthogonal modes by cH, cV, ccH and ccV. To control the dynamics of the pulses, we insert polarization rotating elements consisting of waveplates and fast-switching electro-optical modulators (EOMs) in the arms A and B as well as in the loop (cf. Fig. 1). The initial pulse with a well defined polarization is coupled into the modes ccH and ccV by the incoupler. The polarization of the pulse is rotated by the waveplate and the EOM before it reaches the polarizing beam splitter (PBS) and enters the arms A and B. After a reflection in the arms the pulse reenters the loop and is split into the four available modes, depending on the arm's polarization rotation. For detection of the pulses, we place another weakly reflecting beam splitter (R ≈ 2 %) in the loop and use a pair of PBSs and four superconducting nanowire single-photon detectors to discriminate the four internal states. By using single mode optical fibers of different lengths (328 and 338 m) in the arms A and B, we introduce a well-defined time delay of $\tau_{\mathrm{pos}} = 95$ ns between pulses that took different arms. By choosing the time delay to be longer than the pulse widths (≈ 100 ps) and detector dead times (≈ 90 ns with 90 % recovery of the efficiency), we can resolve the outcoupled pulses with different delays and associate them to unique time bins.
The roundtrip efficiency (i.e. the transmission from one step to another) in the looped interferometer is 63 ± 3 %, which significantly improves on the performance of earlier setups with efficiencies of ≈ 40 % as presented in [31]. In order to achieve a good signal-to-noise ratio for high step numbers we perform measurements with two different initial power levels, which are then concatenated. This concatenation of two data sets is necessary since for a low input power the signal becomes too small after a small number of steps, while high input powers cause detector saturation for the early steps and make a reliable probability extraction impossible. In each case we normalize the total intensity per step to one, so that the normalized intensities are equivalent to the walker's probability distribution. In Fig. 2 we illustrate the dynamics of the interferometer, based on the time-multiplexing technique. For reference we additionally provide the Mach-Zehnder-type geometry for the visualisation of the standard principle of time-multiplexing quantum walks as detailed in [18,31]. To understand the dynamics of the Michelson interferometer, it is instructive to follow what happens to pulses coming from the loop, impinging on the PBS in all four modes cH, cV, ccH and ccV at once. The PBS guides the pulses from modes cH and ccV into arm A, and from modes ccH and cV to arm B. In each arm the polarized pulses are rotated by optical elements implementing $C_A$ and $C_B$, respectively, and a relative time delay between the two different paths is introduced. Back at the PBS, the pulses are reflected or transmitted according to their polarization, such that e.g. the originally horizontal and clockwise traveling pulse, upon entering and leaving arm A, is mapped onto modes ccH and cV for the next loop iteration.
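The concatenation of the two power levels and the per-step normalization described above can be sketched in a few lines. This is only an illustration: the data layout (one list of intensities per step), the switching step and all numbers are hypothetical and not taken from the experiment.

```python
def concatenate(low_power, high_power, switch_step):
    """Use low-power data for the early steps and high-power data afterwards.

    The switching step is a hypothetical parameter chosen for illustration.
    """
    return low_power[:switch_step] + high_power[switch_step:]


def normalize_per_step(intensities):
    """Divide each step's intensities by that step's total intensity."""
    normalized = []
    for row in intensities:
        total = sum(row)
        if total == 0:
            raise ValueError("step without any detected intensity")
        normalized.append([value / total for value in row])
    return normalized


# Hypothetical raw intensities (arbitrary units): rows are steps, columns are time bins.
low = [[100, 0, 0], [0, 30, 30], [2, 1, 2]]
high = [[9999, 0, 0], [0, 9000, 8900], [400, 300, 300]]

probabilities = normalize_per_step(concatenate(low, high, switch_step=2))
```

After normalization every row sums to one, so each row can be read as the walker's probability distribution at that step.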
A full roundtrip is thus defined by a rotation of the clockwise and counter-clockwise propagating pulses by the elements in the loop, followed by the mode-dependent rotation and delay in the two arms A and B.
Particularly simple dynamics can be observed if the optical elements in the arms A and B are set up such that the net effect of the double passage and reflection is a rotation of the pulse polarization by 90°. In this case an initially counter-clockwise traveling pulse continues to travel in the counter-clockwise direction after returning from the arms A and B. Here the only role of the arms A and B is to provide a polarization dependent delay, while mixing of polarizations depends solely on the elements located inside the loop. By controlling the elements inside the loop, a wide range of general 1D quantum walk dynamics is accessible, limited only by the capabilities of the available optical components.

A. Time evolution of pulses as a quantum walk
For the purposes of mathematical description, we use a formal mapping between a wave mechanical superposition of spatially or temporally separated optical pulses and a quantum mechanical superposition of states of a photon [59] representing the quantum walker, as employed in our previous works [18,19,31,43].
The state of a discrete-time quantum walker is described by $|\Psi\rangle$, a vector in the corresponding tensor product Hilbert space $\mathcal{H} = \mathcal{H}_c \otimes \mathcal{H}_x$. For a DTQW on a line, the position Hilbert space $\mathcal{H}_x$ equals $\ell^2(\mathbb{Z})$, spanned by the basis vectors $\{|x\rangle \mid x \in \mathbb{Z}\}$ associated with all possible positions $x$. The coin Hilbert space, $\mathcal{H}_c$, describes the internal degree of freedom. For a 1D walk a two-dimensional coin space is usually assumed, which has facilitated the use of polarization for this purpose by a number of research groups (see e.g. [17,18,20,44]). In a Michelson geometry (Fig. 2(b)) the walker can additionally be in a superposition of the two traveling directions in the loop, resulting in a four-dimensional coin space for a 1D walk. We introduce four orthogonal basis states of $\mathcal{H}_c$ as $\{|cH\rangle, |cV\rangle, |ccH\rangle, |ccV\rangle\}$, representing the four orthogonal modes introduced earlier.
The unitary evolution of a DTQW is determined by the coin operator $\hat{C}$ acting on the internal degree of freedom, followed by the step operator $\hat{S}$, which performs a conditional shift in the position $x$; together we write $|\Psi_{t+1}\rangle = \hat{S}\hat{C}\,|\Psi_t\rangle$. In the convention defined by the experimental setup (Fig. 2(b)), $\hat{S}$ shifts the basis states $|cH\rangle$ and $|ccV\rangle$ ($|ccH\rangle$ and $|cV\rangle$) one position to the left (right), which corresponds to earlier (later) arrival times, and simultaneously reverses the traveling direction; in quantum walk terminology such a conditional shift combined with a reversal of direction is commonly referred to as a flip-flop step operator. Formally, the operator can be expressed as
$$\hat{S} = \sum_{x} \Big[ \big(|ccH\rangle\langle cH| + |cV\rangle\langle ccV|\big) \otimes |x-1\rangle\langle x| + \big(|cH\rangle\langle ccH| + |ccV\rangle\langle cV|\big) \otimes |x+1\rangle\langle x| \Big]. \tag{2}$$
Since the position space is still one dimensional but two different coin states indicate a step to the left and two to the right, the structure of the walk can be visualized as a line graph with doubled edges, as illustrated in Fig. 2(d).
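The action of the flip-flop step operator can be made concrete with a small sketch that stores the state as a dictionary of amplitudes. The mode labels follow the text; the data structure and the assumption that the step leaves the polarization label unchanged while reversing the traveling direction are choices made for this illustration.

```python
# Modes that move one position to the left, mapped to their direction-reversed partner.
LEFT_MOVERS = {"cH": "ccH", "ccV": "cV"}
# Modes that move one position to the right, mapped to their direction-reversed partner.
RIGHT_MOVERS = {"ccH": "cH", "cV": "ccV"}


def flip_flop_step(state):
    """Apply the flip-flop step to a state given as {(x, mode): amplitude}."""
    new_state = {}
    for (x, mode), amplitude in state.items():
        if mode in LEFT_MOVERS:
            key = (x - 1, LEFT_MOVERS[mode])
        else:
            key = (x + 1, RIGHT_MOVERS[mode])
        new_state[key] = new_state.get(key, 0) + amplitude
    return new_state


psi = {(0, "cH"): 0.6, (0, "cV"): 0.8j}
once = flip_flop_step(psi)
twice = flip_flop_step(once)
```

In this sketch `once` has the two components displaced to positions -1 and +1 with reversed traveling directions, and `twice` coincides with `psi`: a left-moving mode is turned into a right-moving one and vice versa, so two successive steps undo each other.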
The coin matrix describes the combined action of three $2 \times 2$ polarization rotations defined by the three operations $C_L$, $C_A$ and $C_B$ in the loop and the two arms, respectively. Note that the elements in the arms A and B are passed twice by each pulse entering the respective arm; by $C_A$ and $C_B$ we describe the full rotation accumulated by the time the pulse re-enters the loop. To realize the desired polarization rotations, we use quarter-wave plates (QWPs), half-wave plates (HWPs) and EOMs. In the polarization basis $\{|H\rangle, |V\rangle\}$ the waveplates aligned at an angle $\alpha$ are characterized by the matrices $C_{\mathrm{QWP}}(\alpha)$ [Eq. (3)] and
$$C_{\mathrm{HWP}}(\alpha) = \begin{pmatrix} \cos 2\alpha & \sin 2\alpha \\ \sin 2\alpha & -\cos 2\alpha \end{pmatrix}, \tag{4}$$
respectively. The EOMs are aligned such that they are described by the matrices of Eq. (5), with the phase $\varphi$ depending on the voltage applied to the particular EOM during a particular time bin [31]. In all cases involving dynamic EOM switches we align the quarter- and half-wave plates at $\alpha = 45^\circ$, so that the matrices (3) and (4) commute with (5) and it is inconsequential in which order a pulse encounters them. Since the elements in the loop do not mix counter-propagating pulses, their effect can be described in the basis $\{|cH\rangle, |cV\rangle, |ccH\rangle, |ccV\rangle\}$ by the block diagonal matrix
$$C_{LL} = \begin{pmatrix} C_L & 0 \\ 0 & C_L \end{pmatrix}. \tag{6}$$
Due to the action of the PBS, the optical elements in the arms A and B mix pulses from different traveling directions, so that the total operation corresponds to the matrix $C_{AB}$ [Eq. (7)], transforming e.g. $|cH\rangle$ into a superposition of $|cH\rangle$ and $|ccV\rangle$. The coin matrix of the quantum walk arises as the product of these two matrices,
$$C = C_{AB}\, C_{LL}. \tag{8}$$
When the polarization rotations are static in time, we can express the full coin operator as $\hat{C} = C \otimes \mathbb{1}_x$. However, due to the unique relation between time bins and the position and step number of the walker, we can program specific phase shifts $\varphi_{t,x,A}$, $\varphi_{t,x,B}$ and $\varphi_{t,x,L}$ to be realized for each time bin by the three EOMs, thus making the coin operator position and time dependent, formulated as $\hat{C}_t = \sum_x C_{t,x} \otimes |x\rangle\langle x|$.
In this work we will only make use of the position dependence of the coin, keeping the same operations for each step t.
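As a rough numerical illustration of how the $4 \times 4$ coin is assembled from the $2 \times 2$ operations, the sketch below builds the loop and arm matrices and multiplies them. The half-wave plate matrix is the one quoted in the text. The ordering of the basis states inside each arm block (`idx_a`, `idx_b`) is an assumption made here for illustration, based only on the statement that arm A receives the modes cH and ccV and arm B the modes ccH and cV; the function names are hypothetical.

```python
import numpy as np


def hwp(alpha):
    """Half-wave plate at angle alpha (radians), as quoted in the text."""
    return np.array([[np.cos(2 * alpha), np.sin(2 * alpha)],
                     [np.sin(2 * alpha), -np.cos(2 * alpha)]], dtype=complex)


# Coin basis order used throughout: cH, cV, ccH, ccV.
def loop_coin(c_l):
    """Block-diagonal loop operation: the same C_L for both traveling directions."""
    return np.kron(np.eye(2), c_l)


def arm_coin(c_a, c_b):
    """Arm operation; the ordering inside each block is an assumed convention."""
    c_ab = np.zeros((4, 4), dtype=complex)
    idx_a = [0, 3]  # arm A couples cH and ccV
    idx_b = [2, 1]  # arm B couples ccH and cV
    c_ab[np.ix_(idx_a, idx_a)] = c_a
    c_ab[np.ix_(idx_b, idx_b)] = c_b
    return c_ab


# Example: Hadamard-type rotation in the loop, polarization swap in both arms.
coin = arm_coin(hwp(np.pi / 4), hwp(np.pi / 4)) @ loop_coin(hwp(np.pi / 8))
is_unitary = np.allclose(coin.conj().T @ coin, np.eye(4))
```

With these settings a pulse entering in the mode ccH (third column of `coin`) leaves the coin in an equal-weight superposition of cH and cV. A position dependent coin could be represented in the same spirit by storing one such matrix per time bin.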
B. Set of directly accessible coins, achieving universality with a three-step protocol
The product (8) covers a useful subset of $U(4)$, as demonstrated by the experiments described in the following sections. Suppose we intend to realize a certain coin operator: how can we tell whether this target coin can be decomposed into this form? It turns out that there is a particularly simple condition, requiring the pairwise linear independence of two appropriately chosen pairs of vectors formed from the elements of the $4 \times 4$ coin matrix. We have included the proof in Appendix A 1. The proof is constructive in the sense that it shows how to efficiently find a decomposition of the form of Eq. (8) for a particular coin $C$, provided it exists.
A larger class of coins can be covered by using operators $C_{LL}$ without the restriction that they act the same on c- and cc-propagating pulses. This can be achieved e.g. by altering our setup such that counter-propagating pulses reach the loop EOM with a sufficient time difference, allowing the programming of different rotations.
The full $U(4)$ can be recovered by employing a multistep protocol [34,45,46] consisting of three steps. A crucial fact that the protocol uses is that any four-dimensional unitary matrix can be written as a product of two matrices, each of the form $C_{AB} C_{LL}$; we prove this rigorously in Appendix A 2. Leveraging the flip-flop nature of $\hat{S}$, namely that two successive applications of the step operator cancel each other, $\hat{S} \cdot \hat{S} = \mathbb{1}$, we consider a sequence of coins $C_1$, $\mathbb{1}$ and $C_2$, leading to the overall transformation $(\hat{S}\hat{C}_2)(\hat{S}\,\mathbb{1})(\hat{S}\hat{C}_1) = \hat{S}\hat{C}_2\hat{C}_1$. Appendix A 2 contains additional details of these arguments. We note that, besides static coins of the form $\hat{C} = C \otimes \mathbb{1}_x$, the protocol is also applicable to position and time dependent coin distributions.
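The algebra behind the three-step protocol can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the infinite line is replaced by a ring of `N` positions so that all operators are finite matrices, the step operator is assumed to reverse the traveling direction while keeping the polarization, and the two coins are random unitaries rather than coins of the specific form $C_{AB} C_{LL}$ (the identity being checked does not depend on that form).

```python
import numpy as np

N = 6  # ring of N positions, a finite stand-in for the infinite line

# Coin basis order: cH, cV, ccH, ccV. Matrix entries are indexed [output, input].
to_left = np.zeros((4, 4))
to_left[2, 0] = to_left[1, 3] = 1      # cH -> ccH, ccV -> cV (shifted to x - 1)
to_right = np.zeros((4, 4))
to_right[0, 2] = to_right[3, 1] = 1    # ccH -> cH, cV -> ccV (shifted to x + 1)

shift_left = np.roll(np.eye(N), -1, axis=0)   # |x-1><x|
shift_right = np.roll(np.eye(N), 1, axis=0)   # |x+1><x|
S = np.kron(to_left, shift_left) + np.kron(to_right, shift_right)


def random_unitary(rng):
    """Random 4x4 unitary from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
    return q


def lift(coin):
    """Static coin acting identically on every position: coin (x) identity."""
    return np.kron(coin, np.eye(N))


rng = np.random.default_rng(seed=1)
C1, C2 = random_unitary(rng), random_unitary(rng)

three_steps = (S @ lift(C2)) @ (S @ lift(np.eye(4))) @ (S @ lift(C1))
single_step = S @ lift(C2 @ C1)
```

Here `S @ S` is the identity and `three_steps` agrees with `single_step`, which is the statement that the middle step with a trivial coin only serves to cancel one application of the step operator.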

C. Dynamical features of four-dimensional coins
Calculating analytically the evolution of a quantum walker over many steps is generally a demanding task. However, for translationally invariant systems it is possible to characterize the long time asymptotic dynamics in a simple way by analyzing the dispersion relation, i.e. the $k$-dependent quasi-energies $\omega(k)$ of the unitary evolution operator obtained after performing the spatial Fourier transform [60,61]. By locating all local extrema of the group velocities, defined as the derivative $v_g(k) = d\omega(k)/dk$, we can determine the number and propagation speeds of wavefronts emerging from an initially localized state. In the case of a standard 1D quantum walk with a two-dimensional coin, this analysis yields the well-known double-lobed position distribution (see e.g. Fig. 4), with the two wavefronts moving away from the origin at speeds equal to the absolute value of the diagonal elements of the coin matrix, i.e. at velocities $\pm 1/\sqrt{2}$ for the Hadamard walk. While split-step walks exhibit richer dynamics in many respects, their asymptotic dynamics is still characterised by a double-lobed distribution owing to the similarity of their dispersion relation with that of the standard 1D walk (see Appendix A 3 for details). DTQWs with higher dimensional coins, however, have been shown to feature additional ballistically propagating or trapped wavefronts [62]. We would like to note that the simple analysis of the dispersion relation cannot account for the effect of the initial coin state, which generally influences the relative intensities of the ballistic wavefronts, and neither does it provide a characteristic time after which the asymptotic dynamics is guaranteed to set in.
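For the standard two-dimensional Hadamard walk the dispersion analysis is easy to reproduce numerically. The sketch below is a minimal illustration and not the analysis code used for the four-dimensional coin: it diagonalizes the Fourier-transformed step $U(k) = \mathrm{diag}(e^{ik}, e^{-ik})\,H$ on a grid of $k$ values and obtains the group velocity of each band from the eigenvectors (a Hellmann-Feynman type relation). The sign convention of the quasi-energy and the function name are assumptions; the extremal speeds do not depend on that convention.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard coin
Z = np.diag([1.0, -1.0])                        # opposite shift directions of the two coin states


def group_velocities(coin, k):
    """Group velocities of both bands of U(k) = diag(e^{ik}, e^{-ik}) @ coin."""
    U = np.diag([np.exp(1j * k), np.exp(-1j * k)]) @ coin
    _, vecs = np.linalg.eig(U)  # columns are normalized eigenvectors
    # Derivative of the eigenphase with respect to k equals <psi| Z |psi>.
    return np.real(np.einsum("ij,ij->j", vecs.conj(), Z @ vecs))


ks = np.linspace(-np.pi, np.pi, 401)
v = np.array([group_velocities(H, k) for k in ks])
v_max = np.abs(v).max()
```

The largest group velocity found on the grid, `v_max`, reproduces the wavefront speed $1/\sqrt{2}$ quoted above for the Hadamard walk. The same approach carries over to a $4 \times 4$ coin once the corresponding Fourier-transformed step operator is written down.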
We have found that four-dimensional coins can give rise to up to eight wavefronts in the position distribution of the walker; see Appendix A 4 for additional remarks. When the coin operators $C_A$, $C_B$ and $C_L$ are restricted to quarter- and half-wave plates, the symmetries of the system permit degeneracies allowing crossings between different quasi-energy branches, as illustrated in Fig. 3a. Under these restrictions, we observe a behavior similar to a standard 1D walk, exhibiting the usual double-lobe distribution. By considering coin operators built up from several waveplates, we can lift these degeneracies and turn the level crossings between the branches of quasi-energies into avoided crossings (see Fig. 3b). The level repulsion introduces additional bends and thus additional inflection points to the dispersion curves. The new inflection points can give rise to additional wavefronts, each associated with a distinct propagation velocity. The particular propagation velocities can be controlled by the appropriate choice of the coin operator. With the correct choice, the different propagation speeds may be discerned even within a limited number of steps of the QW evolution.

IV. EXPERIMENTAL RESULTS
The experimental results we present in this section can be divided into two groups. The experiments reported in Secs. IV A and IV B explore the translationally invariant dynamics of a quantum walker with a four-dimensional coin operator, using static waveplates (WPs) to implement the coin operators $C_{LL}$ and $C_{AB}$. In Secs. IV C and IV D we study dynamics on finite cyclic graphs of various topologies, using three dynamically controlled EOMs as shown in Fig. 1.

A. Hadamard walk
To demonstrate the coherence properties of the setup in a simple manner, we present dynamics equivalent to a conventional DTQW on a line with a two-dimensional coin. By appropriate choices of the initial state and the waveplate parameters we restrict the dynamics to an invariant subspace corresponding to the coin states $|ccH\rangle$ and $|ccV\rangle$, representing a conventional DTQW in the cc traveling direction of light pulses. In particular, we set $C_L$ to a Hadamard operation, realized up to a global phase by a HWP at $\alpha = 22.5^\circ$, yielding
$$C_L = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{9}$$
The polarization rotations $C_A$ and $C_B$ in the arms are set to a polarization swap by introducing QWPs at the angle $\alpha = 45^\circ$, which are passed twice. Thus, up to an irrelevant global $-i$ phase, the corresponding coin operator $C_{AB}$ [Eq. (10)] maps $|ccH\rangle$ to $|cV\rangle$ and $|ccV\rangle$ to $|cH\rangle$, and the subsequent step operator (2) brings the traveling direction back to cc. Due to the absence of mixing of traveling directions, the pulses only ever travel in the loop in the counter-clockwise direction, in which the walk was initiated.
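The two waveplate settings used here can be checked with a few lines of linear algebra. The half-wave plate matrix is the one quoted in the text. The quarter-wave plate matrix below is an assumed Jones-matrix convention (a common textbook form, chosen here because it reproduces the global $-i$ phase mentioned above), and the double passage is modeled simply as applying the same matrix twice, ignoring any details of the reflection.

```python
import numpy as np


def hwp(alpha_deg):
    """Half-wave plate at angle alpha (degrees), as quoted in the text."""
    a = np.deg2rad(alpha_deg)
    return np.array([[np.cos(2 * a), np.sin(2 * a)],
                     [np.sin(2 * a), -np.cos(2 * a)]])


def qwp(alpha_deg):
    """Quarter-wave plate at angle alpha (degrees); assumed convention."""
    a = np.deg2rad(alpha_deg)
    return np.array([[1 - 1j * np.cos(2 * a), -1j * np.sin(2 * a)],
                     [-1j * np.sin(2 * a), 1 + 1j * np.cos(2 * a)]]) / np.sqrt(2)


hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
swap = np.array([[0, 1], [1, 0]])

loop_operation = hwp(22.5)          # Hadamard operation in the loop
double_pass = qwp(45) @ qwp(45)     # two passages through a QWP at 45 degrees
```

In this convention `loop_operation` equals the Hadamard matrix and `double_pass` equals the polarization swap multiplied by $-i$.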
The system reduces to a quantum walk with a two-dimensional coin, and can be described by the effective step and coin operators of Eq. (11), where we use $|R\rangle$ and $|L\rangle$ to follow the conventional notation for the right and left shifted components, respectively. The abstract states $|R\rangle$ and $|L\rangle$ correspond to $|ccH\rangle$ and $|ccV\rangle$ in the experiment. As a figure of merit we use the polarization resolved similarity $S(t)$ between experimental and numerical probabilities, defined in Eq. (12) for the relevant positions $x$ and the coin states $d$ at a certain step $t$. We also make use of the average similarity (over $T$ steps), defined as $\bar{S} = \frac{1}{T} \sum_{t=1}^{T} S(t)$. The measured standard Hadamard walk over $T = 25$ steps exhibits a similarity of $\bar{S} = 91.2\,\%$ to the theoretical expectation (Fig. 4), demonstrating the outstanding coherence properties within the polarization degree of freedom of each propagation direction.
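To illustrate how such a figure of merit can be evaluated, the sketch below uses an assumed form of the similarity, the squared overlap $\big(\sum_{x,d} \sqrt{P_{\mathrm{exp}}(x,d)\,P_{\mathrm{num}}(x,d)}\big)^2$. This is a common choice for comparing probability distributions, but it is an assumption here and is not copied from Eq. (12); only the averaging over $T$ steps follows the definition given in the text. The data layout and the function names are hypothetical.

```python
import math


def similarity(p_exp, p_num):
    """Assumed similarity: squared overlap of two distributions {(x, d): probability}."""
    overlap = sum(math.sqrt(p_exp[key] * p_num.get(key, 0.0)) for key in p_exp)
    return overlap ** 2


def average_similarity(exp_steps, num_steps):
    """Mean of S(t) over all T steps."""
    values = [similarity(e, n) for e, n in zip(exp_steps, num_steps)]
    return sum(values) / len(values)


# Hypothetical two-step example with keys (position, polarization).
numerical = [{(0, "H"): 1.0}, {(-1, "H"): 0.5, (1, "V"): 0.5}]
experimental = [{(0, "H"): 1.0}, {(-1, "H"): 0.5, (1, "V"): 0.5}]
s_bar = average_similarity(experimental, numerical)
```

With this assumed definition, identical distributions give a similarity of one and distributions with disjoint support give zero.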
To test the robustness of coherence between the two propagation directions, additional measurements were performed in which we implemented dynamics alternating between the subspaces associated with the cc and c traveling directions at every DTQW iteration (see Appendix B 1 for the details). The obtained similarity of $\bar{S} = 93.1\,\%$ between numerical and experimental data confirmed the high overall coherence properties of the setup, indicating a good basis for implementing more advanced quantum walk dynamics.

B. Walk with a genuine four-dimensional coin
In this section we report on a dynamical feature genuine to four-dimensional coins. As pointed out in Sec. III C, such coins can give rise to multiple wavefronts in the position distribution of the quantum walker, i.e. after tracing out the coin degrees of freedom. Since the dispersion analysis providing the wavefront structure and dynamics is accurate only in the long-time limit, the DTQW parameters have to be chosen carefully to allow sufficient resolution of the peaks within the experimental time-scale. Our strategy is to tune the coin parameters such that we obtain the largest difference in propagation speeds.
We consider the four-dimensional coin operator implemented by placing a HWP in the loop, corresponding to $C_L = C_{\mathrm{HWP}}(20^\circ)$, and two quarter-wave plates in each arm, aligned at 27° and 0°, respectively, which set the arm operations $C_A$ and $C_B$. The dispersion spectrum of the DTQW with these parameters is depicted in Fig. 3b, where we can clearly resolve the effect of level repulsions. With this choice we have reduced the number of distinct propagation velocities from eight to four (±0.1655 and ±0.5538), due to degeneracies. While experimentally only 18 steps are reachable, we have numerically calculated the evolution of the intensities for 50 steps to compare with the results of the asymptotic analysis. The results of this calculation are presented in Fig. 5, with solid green lines indicating the peak positions given by the asymptotic analysis. While the two faster peaks separate quickly within the experimentally achievable domain (indicated by a horizontal dashed line), the two slower peaks separate only after about 35 steps. Therefore, we can expect to be able to resolve three peaks in the experimental data: the two outer peaks corresponding to the faster wavefronts, and a single peak in the middle resulting from the transient overlap and interference between the two slower wavefronts.
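The asymptotic peak positions follow directly from the quoted propagation velocities as $x = \pm v\,t$. The short sketch below only evaluates this relation for a few step numbers; it does not simulate the walk, and the choice of step numbers is made for illustration.

```python
SPEEDS = (0.1655, 0.5538)  # propagation speeds quoted in the text


def peak_positions(t):
    """Asymptotic wavefront positions +/- v * t after t steps, sorted left to right."""
    return sorted(sign * v * t for v in SPEEDS for sign in (-1, 1))


def inner_peak_distance(t):
    """Distance between the two slow wavefronts after t steps."""
    return 2 * SPEEDS[0] * t


for t in (18, 35, 50):
    print(t, [round(p, 2) for p in peak_positions(t)], round(inner_peak_distance(t), 2))
```

At the 18 experimentally reachable steps the slow wavefronts are only about six positions apart, compared with roughly twelve positions at step 35, which gives a feeling for why the two inner peaks are not yet resolved in the experimental data.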
The experimental results for the complete evolution are depicted in Fig. 6a. We can observe that the propagation velocities of the two outer peaks closely match the asymptotically expected values. To offer a direct comparison of the numerical and experimental data we present the respective probability distributions after 15 steps as a bar chart plot in Fig. 6b. In addition to the numerical distribution, we have indicated the peak positions yielded by the asymptotic analysis by vertical arrows. The positions and intensities of the outer peaks appear to be robust to the unavoidable experimental imperfections (affecting both the evolution and the initial state). The central peak structure shows greater sensitivity: while the numerical results display a more dominant right wavefront, in the experiment the left propagating one appears to be dominating.
While the overlap and interference prevent the resolution of the positions and intensities of the two inner peaks, the presence of more than two wavefronts proves the realization of a DTQW with a genuine four-dimensional coin.

C. Quantum walks on circles
With the large degree of coherence provided by the Michelson loop for static coins confirmed, we focus on harnessing the possibilities offered by dynamic control of all three EOMs shown in Fig. 1. We have developed a scalable technique to use the additional control and the higher dimensional coin space to efficiently realize DTQWs on cyclic graphs of various topologies.
[Fig. 8 caption] Intensity distribution in step 11 of the experimental (red) and numerical (blue) data for almost ideal mixing (polarization is traced out). Lower panel: Similarity to the flat distribution on the relevant positions 1, 3, 5, 7 of the experimental (red dots) and numerical (blue dots) data, plotted versus the roundtrip number. In both cases the deviation of the experimental data from the numerical data can be explained by imperfect switchings at the boundaries, such that a small part of the intensity leaves the circle sites. Since we present the data without renormalization over the circle sites only, but take the "lost" intensity into account, the walker's overall intensity over the circle positions is less than 1. For original data and description of the error bars see Fig. 14.
A circle graph, while locally appearing one dimensional, can only be embedded in 2D Euclidean space. However, DTQWs on circles can still be implemented using well-chosen dynamics on a 1D line graph, either by exploiting the bipartite structure [46] or, as we explain below, by using additional coin degrees of freedom. Separating the two halves of the coin space based on the propagation directions enables the implementation of DTQWs on two parallel 1D lines (see Sec. IV A). By pairwise connecting the ends of these lines at appropriately chosen positions with the help of controlled operations, the 1D positions can be mapped to the upper and lower arcs of a circle. This general and scalable technique can be used to implement DTQW dynamics on circles with position dependent coin operators and arbitrary sizes.
In the experiments reported here, the DTQW dynamics on circles satisfy periodic boundary conditions. However, by mapping to an underlying periodic spatial structure, more general twisted boundary conditions can be realized as well [63].
The technique to realize circles involves position dependent coins, which we have implemented by three fast-switching EOMs, each realizing a specific polarization rotation according to Eq. (5). EOMs are placed in the loop and in each of the two arms, along with the static waveplates. In Fig. 7a we demonstrate how a circle can be formed in the graph of Fig. 2(d). We can describe this walk using a two-dimensional coin and a step operator as in Eq. (11), but with an additional periodic boundary condition $|m\rangle \equiv |m + 2N\rangle$.
We have measured the results of applying both mixing and non-mixing operations on the circle. Note that instead of the conventional Hadamard operation as given in Eq. (11) we here use another balanced matrix with different complex phases [Eq. (13)]. This is because the coin matrix in Eq. (11) cannot be directly realized by an EOM, which we need for the position dependence. Note that this gives the same 50:50 splitting, and as such we refer to (13) as a Hadamard-like coin. For the different settings and the associated physical implementation see Table I. In Fig. 7 we plot the intensity evolution of the walk on a 10-node circle for both the non-mixing and the H operation, for which we need to employ all three EOMs for the dynamic switchings, discriminating between inner and boundary positions of the graph. We plot only over the relevant positions m = 0, . . . , 9 and find a high agreement between experiment and numerics. A characteristic effect observable in walks on certain circle graphs is the so-called equidistribution or equalization, meaning that the probability distribution corresponding to the wave function becomes close to uniform. In an earlier work a similar effect has been studied in QWs on the line [64], where the term mixing was used. We, however, find it more appropriate to reserve the use of the term mixing for a property that arises as a time average [53,65,66], acknowledging that unitary processes generally do not converge to a stationary distribution. We analyze the equidistribution in detail in Fig. 8, where we present the intensity histogram for roundtrip 11, in which the experimental data from an 8-node circle shows nearly equal intensity at all four occupied positions (see Fig. 14b for the complete evolution). We note that the equidistribution effect is not universal, and is exhibited only by circles of certain sizes, among which the 8-node circle is the largest [67].
In the lower panel of Fig. 8 we track the similarity of the walker's probability distribution (summed over the coin degrees of freedom) to the uniform distribution, as a function of the roundtrip number. The similarity of position distributions is defined analogously to the similarity $S$ defined in Eq. (12), just with the $d$ indices dropped. We can extract an equidistribution time of approximately 10-12 roundtrips, in agreement with the numerical model. This equidistribution effect is likely linked to the perfect state revival after 24 steps for an 8-node circle [68]: an initially localized state goes through a uniform distribution at half of the period, along with some neighboring steps. An example of a smaller circle with 4 sites showing the revival of the initial state in 8 repetitions is presented in Appendix B 2. The technique used for implementing circle graphs is inherently scalable to realizing any circle with an even number of nodes, since the size is set by choosing the switching times of the EOMs, without needing any extra resources. We present results for circles of sizes 8 and 16 in Appendix B 2.
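The link between revival and equidistribution can be reproduced in a small simulation. The sketch below is an illustration under explicit assumptions: it uses the textbook Hadamard coin with an ordinary conditional shift on an 8-node cycle, not the Hadamard-like coin of Eq. (13) or the EOM switching scheme of the experiment, and the function name and initial state are chosen for this example.

```python
import numpy as np


def hadamard_walk_on_cycle(n_nodes, steps):
    """Textbook Hadamard walk on a cycle; returns the amplitudes after `steps` steps."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    psi = np.zeros((n_nodes, 2), dtype=complex)
    psi[0, 0] = 1.0                                  # walker at node 0, coin state R
    for _ in range(steps):
        psi = psi @ H.T                              # coin toss on every node
        psi = np.stack([np.roll(psi[:, 0], 1),       # R component moves to m + 1
                        np.roll(psi[:, 1], -1)],     # L component moves to m - 1
                       axis=1)
    return psi


initial = hadamard_walk_on_cycle(8, 0)
half_period = hadamard_walk_on_cycle(8, 12)
full_period = hadamard_walk_on_cycle(8, 24)
position_probs = (np.abs(half_period) ** 2).sum(axis=1)
```

For this textbook model the state after 24 steps coincides with the initial state, and after 12 steps the probability is spread equally over the four even nodes, consistent with the picture of an initially localized state passing through a uniform distribution at half of the revival period.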

D. Walks on figure-eight graphs
The circle graphs presented in the previous section represent a significant advance, offering a basis for simulation of systems obeying periodic boundary conditions. Our setup, however, is capable of realizing graphs more complex than these rank-two regular graphs. As an example, we realize DTQWs on a simple instance of a Husimi cactus graph having the shape of a figure-eight, depicted in Fig. 9a, both with non-mixing and Hadamard-like coin dynamics on the circle arcs. Due to the coupled loops, Husimi cactus graphs are studied in the context of polymer networks in solution in the field of chemical physics [69], and for the interplay of search probability and centrality of the marked vertex in quantum search algorithms [70].
To implement novel dynamics, the coin at the central rank-four node where the two circles are joined must be a genuine four-dimensional operator. Dynamics on the nodes of the circle are experimentally implemented by dynamically controlled elements analogously to that of the circles (listed in Table I). In order to implement the additional links at position x = 0 (equivalently, m = 7) we perform in the non-mixing setting an extra switch with the EOMs in the arms by −90°, compensating the polarization swap by the static elements. The results for both of the settings are presented in Fig. 9 (b) and (c), respectively. One can clearly observe the light reappearing at node 7 after one cycle around the right and the left half of the figure-eight. Again, the coherence properties ensure a high agreement of experimental and numerical data even on such a complex graph structure. This proves the versatility of the four-dimensional coin operation compared to its two-dimensional counterpart for tailoring the ballistic spreads of a translation invariant system and the flexibility in designing non-trivial graphs. We note that the lengths of the left and right loops could have been chosen arbitrarily, and the present choice was made such that we can observe interference between pulses within the experimentally attainable steps. Our work opens the route to experimental simulation of energy transport in biological structures, for example in the photosynthetic apparatus of the purple bacterium, which is modelled by coupled circular and figure-eight shaped light harvesting structures [71][72][73].

V. CONCLUSION AND OUTLOOK
Fully exploiting the potential of discrete time quantum walks requires a reliable and comprehensive implementation of higher dimensional coin operators. To realize DTQW dynamics with genuine four-dimensional coin operators we have developed a novel experimental platform based on a looped Michelson interferometer. We have carried out several experiments with increasing complexity, demonstrating the accuracy, stability and capacity of our setup to realize four-dimensional coins. We started from a conventional Hadamard walk on the line, which shows high coherence over many steps, but essentially uses only a two-dimensional subspace of the available coin operations. Next we presented a coin implementation that exhibits multiple distinct propagation wavefronts, precluding any description as a walk with an effectively two-dimensional coin. Based on the four-dimensional coin degree of freedom, we developed a scheme to realize quantum walks on circles with programmable sizes over many steps by dynamic coin operations, without resorting to experimentally costly multi-step schemes. Finally, we presented a QW on an example of a Husimi cactus graph, resembling a figure eight, that involved realizing periodic boundary conditions at both ends and a central node with a coin equivalent to having links to four neighbors. Realization of this structure required the simultaneous implementation of genuine four-dimensional and effectively two-dimensional coins during the evolution. The flexibility of the existing setup has been demonstrated by two experiments, with dynamics on the arcs of the figure eight corresponding to a non-mixing, and a Hadamard-like coin, respectively.
The experimental platform in principle allows the realization of DTQWs with arbitrary four-dimensional coin operators, limited only by the polarization rotating elements available to the experiment. We have proposed an explicit three-step protocol that makes use of the property that every 4 × 4 unitary can be expressed as a product of two coin operators from the class achievable in a single roundtrip. Therefore, any coin, such as the Grover or the Fourier coin, is achievable, reaching far beyond the capacities of previous experiments relying on multi-step protocols based on two-dimensional coins [34,45,46].
Losses of the optical signal at each round trip play a critical role in the applicability of the setup. Deterministic incoupling and outcoupling, replacing the partially reflecting mirror, has recently been employed in the Mach-Zehnder geometry, yielding nearly 40 steps [43] by reducing the roundtrip losses below 20 %. Applying this approach to the present geometry would be necessary for more advanced applications requiring long evolution times and multiple walkers.
The present experiment makes use of three EOMs each with limited switching capabilities. If polarization components with sufficiently versatile dynamical control are available, as our theoretical results indicate above, arbitrary dynamically controlled 4 × 4 coin operators are reachable. Such technology would enable the implementation of DTQW on a line with effective three-dimensional coins, such as lazy walks [47][48][49][74], and a simple quantum game [75]. Appropriate switching of coins would also enable the realization of DTQW dynamics with arbitrary twisted boundary conditions [63] by implementing the underlying periodic structures, or more generally quivers [76].
The setup could be extended to realize dynamics on 2D lattices by adding additional delay lengths, similarly to how it has been implemented in the Mach-Zehnder geometry [19,32]. The availability of arbitrary coin operators would allow for the first time the experimental study of magnetic walks [77] and the QSH topological phase of quantum walks [50]. In addition, studies with four-dimensional coins enable possible observation and applications of an Anderson transition on a 2D lattice, an effect that could not be observed in split-step walks [78], by providing a mechanism analogous to spin-orbit coupling [79]. Higher dimensional coined DTQWs in 2D lattices could be combined to implement wrapped geometries allowing the experimental study of search protocols requiring periodic boundary conditions [6,80,81], and dynamics on Möbius-strip like graphs [82]. Additionally, implementing DTQWs on other non-trivial graph structures involving distinguished nodes with several neighbors would provide a basis for search algorithms [70] and graph isomorphism testing [83].
The above examples rely on four dimensional coin operators providing a structure to the dynamics not attainable using lower dimensional coin space dynamics. Our platform provides the first instance of an extensible realization of a quantum walk with four-dimensional coin operators with precise dynamic control, paving the way to experimental implementations of many important applications relying on genuine higher dimensional coins.

(A1) can be written in the form (see Eq. (8)) both have rank one (i.e., linearly dependent rows or columns).
The implication from (A2) to the latter property follows trivially from performing the matrix multiplication, but it is instructive to have an explicit expansion:
Let us now treat the opposite implication, i.e., assume that the two matrices in (A4) are of unit rank, so their elements must be of the form for some $\alpha_i, \beta_i, \gamma_i, \delta_i \in \mathbb{C}$. Any matrix formed by such elements can be written in the following product form, (A7)
This is already close to the desired form (A2) but nothing so far guarantees that the two matrices forming the right-hand side are also unitary and thus realizable by separate physical transforms. It is easy to show that if $\alpha_1$ and $\alpha_2$, or $\gamma_1$ and $\gamma_2$, were simultaneously zero, $C$ would be singular. In all the other cases there is some freedom in decomposing the left-hand sides of (A6), so without loss of generality we can assume that $|\alpha_1|^2 + |\alpha_2|^2 = |\gamma_1|^2 + |\gamma_2|^2 = 1$.
The unitarity of $C$ postulates that the norms of its first and last rows must be equal to 1 and their scalar product must vanish. With the above assumption these equations take the forms $|\beta_1|^2 + |\delta_1|^2 = 1$, This simply says that the matrix is unitary. In (A7) these four elements take the positions of the elements $a_{ij}$ of (A2). Similarly from the middle two rows of (A7) we derive the unitarity of or $(b_{ij})$. We also require that the first and second columns of $C$ are normalized and orthogonal vectors. Note that the unitarity of (A9) and (A10) also implies
$|\beta_1|^2 + |\beta_2|^2 = |\beta_3|^2 + |\beta_4|^2 = |\delta_1|^2 + |\delta_2|^2 = |\delta_3|^2 + |\delta_4|^2 = 1$. (A11)
Using the last equality, the row orthonormality condition gives the following equations, which again are nothing else than the conditions on unitarity of forming the blocks of the latter matrix in (A7).
In conclusion, the conditions stated by the theorem only allow matrices exactly of the form (A2) where the submatrices corresponding to $C_A$, $C_B$, $C_L$ are all unitary. Their elements can easily be reconstructed using the following algorithm:
1. Build the matrices (A4) and take any decomposition of the form (A6), the existence of which is guaranteed by the assumptions.

Note:
The same derivation can be repeated with minimal changes when the two diagonal blocks of the latter matrix of (A2) are not required to be equal, only unitary (as would correspond to transforming the clockwise- and counter-clockwise-propagating pulses in the loop independently). The two matrices in (A4) then need to be replaced by four matrices (A14)

Universality of available coins
The three-step protocol consists of
Step 1: apply a coin $C_1$, evolve over one round trip,
Step 2: let the wave packets finish one full round trip with a trivial coin,
Step 3: apply another coin $C_2$, finish the round trip.
Here we rigorously prove that any $U(4)$ coin is experimentally attainable using this protocol. First, we point out the flip-flop nature of the step operator: two applications thereof amount to the identity map. So if the coin is left trivial ($C_A = C_B = C_L = \mathbb{1}$) in Step 2, the application of $\hat{S}$ in Step 2 negates any displacement made in Step 1 and returns the internal state to what it was immediately after the application of $C_1$ in Step 1. The state after three steps is obtained by applying $(\hat{S} C_2)\,\hat{S}\,(\hat{S} C_1) = \hat{S} C_2 C_1$ to the initial state; thus the internal state is effectively transformed by the product $C_2 C_1$ and subject to just one flip-flop displacement, according to the final coin state. The total action of these three round trips can therefore be perceived as a single step of a quantum walk with a more general coin.
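The cancellation argument above can be checked numerically on a simplified model. The sketch below is not the four-dimensional experimental step operator: it assumes a two-dimensional direction coin on a small cycle and a flip-flop shift that moves the walker and reverses its direction, which shares the property that two applications give the identity. The coins and all function names are illustrative assumptions.

```python
import cmath
import math
import random


def apply_coin(state, coin):
    """Apply the same 2x2 coin matrix at every position."""
    return [[coin[0][0] * a + coin[0][1] * b, coin[1][0] * a + coin[1][1] * b]
            for a, b in state]


def flip_flop_shift(state):
    """Move along the current direction and reverse it (periodic boundaries)."""
    n = len(state)
    out = [[0j, 0j] for _ in range(n)]
    for x, (a, b) in enumerate(state):
        out[(x + 1) % n][1] += a  # direction 0 hops to x+1 and turns around
        out[(x - 1) % n][0] += b  # direction 1 hops to x-1 and turns around
    return out


def matmul2(m2, m1):
    """Product m2 * m1 of two 2x2 matrices."""
    return [[sum(m2[i][k] * m1[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def example_coin(theta, phase):
    """An illustrative unitary coin; any 2x2 matrix would do for the identity below."""
    c, s, e = math.cos(theta), math.sin(theta), cmath.exp(1j * phase)
    return [[c, -s * e.conjugate()], [s * e, c]]


def three_round_trips(state, c1, c2):
    state = flip_flop_shift(apply_coin(state, c1))  # Step 1
    state = flip_flop_shift(state)                  # Step 2, trivial coin
    return flip_flop_shift(apply_coin(state, c2))   # Step 3


def one_effective_step(state, c1, c2):
    return flip_flop_shift(apply_coin(state, matmul2(c2, c1)))


def max_difference(s1, s2):
    return max(abs(a - b) for r1, r2 in zip(s1, s2) for a, b in zip(r1, r2))


if __name__ == "__main__":
    rng = random.Random(1)
    psi = [[complex(rng.random(), rng.random()) for _ in range(2)] for _ in range(6)]
    c1, c2 = example_coin(0.3, 0.5), example_coin(1.1, -0.2)
    print(max_difference(three_round_trips(psi, c1, c2),
                         one_effective_step(psi, c1, c2)))
```

The printed difference is at the level of floating-point rounding, since the two shifts surrounding the trivial coin cancel exactly in this model.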
This coin becomes indeed completely general if we allow the blocks of $C_{LL}$ (Eq. (6)) to be controlled separately for the c and cc polarizations (upper-left and lower-right blocks):
Theorem 2: Let $C$ be a generic $U(4)$ matrix. Then two transforms of the form can be found, $C_1$, $C_2$, such that $C = C_2 C_1$, with the individual submatrices $a$, $b$, $l$, $l'$ in $U(2)$. Moreover, up to a global phase correction factor, the four submatrices can be sought in $SU(2)$.
We will prove this theorem constructively, using bra-ket notation on $\mathbb{C}^2$: in this section a ket denotes a two-element column vector and a bra with the same symbol its conjugate, a row vector composed of complex conjugate elements. Namely, we pair the unknowns of the decomposition in the following objects: (A17)
The condition on unitarity of $C_{LL}$ then translates into the requirement that $(|p\rangle, |q\rangle)$, $(|r\rangle, |s\rangle)$, $(|P\rangle, |Q\rangle)$, and $(|R\rangle, |S\rangle)$ are four (not necessarily different) orthonormal bases.
We will show that the decomposition stated by Theorem 2 exists even with a further restriction, i.e., the $C_A$, $C_B$ matrices in Step 2 being trivial. In the following $a_{ij}$ and $b_{ij}$ will thus denote $a_{ij,1}$, $b_{ij,1}$ for brevity.
If we split the required coin matrix $C$ into $2 \times 2$ blocks as $C = \begin{pmatrix} C_{TL} & C_{TR} \\ C_{BL} & C_{BR} \end{pmatrix}$, the equation can be expanded block-wise and written as a system of four separate block equations, We are also given the unitarity conditions of $C$: In the block form (A19), the former becomes and the latter (A24) Plugging in (A21), we find that if such a decomposition exists, it must satisfy Along with the orthonormality of $(|p\rangle, |q\rangle)$ etc., the equations (A21) strongly resemble singular value decompositions (SVDs): indeed, they would become SVDs of the left-hand side matrices if, furthermore, the $a_{ij}$ and $b_{ij}$ coefficients were real and nonnegative. Without loss of generality, we can thus postulate that the first line is the actual SVD, i.e., that $|p\rangle$ and $|q\rangle$ are left-singular vectors, $|P\rangle$ and $|Q\rangle$ right-singular vectors and $a_{HH}$ and $b_{VV}$ the singular values of $C_{TL}$, and see if we can satisfy the other three lines with this choice.
Given $|P\rangle$ and $|Q\rangle$, we can apply both sides of the third line of (A21) on them, obtaining If the magnitude of at least one of the coefficients $a_{VH}$ or $b_{HV}$ is known to be nonzero (that is, per (A25), unless the singular values of $C_{TL}$ were both 1), the corresponding $|s\rangle$ or $|r\rangle$ is determined up to a complex phase. If both are, they are guaranteed to be orthonormal by (A24) and (A25). If $a_{VH}$ or $b_{HV}$ is zero, we complement $|s\rangle$ as an orthonormal partner of $|r\rangle$ or vice versa, respectively, with an arbitrary phase. In either case, the choice of the phase of the two vectors leaves $a_{VH}$ and $b_{HV}$ completely determined. The case $a_{VH} = b_{HV} = 0$ will be handled separately near the end of the proof.
Taking the Hermitian conjugate of the second equation of (A21) we find the vectors $|R\rangle$ and $|S\rangle$ and the numbers $a_{HV}$, $b_{VH}$ in complete analogy to the above, leaving the same exceptional case.
After these steps, the last equation does not contain any undetermined vectors, so we need to prove that it is not a contradiction.
Assume $a_{HH} < 1$. Then both $a_{VH}$ and $a_{HV}$ are nonzero and $|s\rangle$ and $|S\rangle$ satisfy We can then study where from step to step we used (A27), (A24) (lower-left block), (A21) (first line), and (A21) (third line). This shows that $C_{BR}$ indeed maps $|S\rangle$ to a multiple of $|s\rangle$, as (A21) requires, but also gives a concrete value to $a_{VV}$ and shows, along with (A25), that the $a_{ij}$ together form a $U(2)$ matrix.
If $a_{HH} = 1$ (but $b_{VV} < 1$), we cannot use (A27), but we have We can still prove that $C_{BR}$ acting on $|S\rangle$ produces some vector orthogonal to $|r\rangle$, which in turn is a multiple of $|s\rangle$: for this, we consider The last step follows from the fact that for $a_{HH}$ equal to 1, $a_{HV} = 0$ and thus $C_{TR}|S\rangle = 0$. We do not learn the phase of $a_{VV}$, as it depends on the arbitrary phases of both $|S\rangle$ and $|s\rangle$, but with $a_{HV} = a_{VH} = 0$ the $a_{ij}$ matrix is unitary for any choice. The emergence of the last term of the last equation of (A21) and the value of $b_{HH}$ are handled similarly, resulting in $b_{ij}$ being unitary and the system (A21) being consistent with our solution. Since the properties of $|p\rangle, |q\rangle, \ldots, |R\rangle, |S\rangle$ also guarantee unitarity of $l_{ij,1}$, $l'_{ij,1}$, $l_{ij,2}$, and $l'_{ij,2}$, this completes the decomposition.
We left out only one special case, $a_{HH} = b_{VV} = 1$. But this case is trivial: now $C_{TL}$ is of the form and so is unitary, $C_{TR}$ and $C_{BL}$ are zero, and $C_{BR}$ is unitary again. Matrices of this block form can be realized in a single round trip; if necessary, a three-step protocol can be made trivially by taking $C_1 = C$, $C_2 = \mathbb{1}$. This last remaining case finishes the main part of the proof.
Restricting the $a$, $b$, $l$, $l'$ submatrices to be special unitary is straightforward using the degrees of freedom encountered throughout the construction above. We will first consider the generic case where the off-diagonal blocks of $C$ are nonzero.
We can follow closely the same algorithm as above but in the beginning, instead of using the SVD of $C_{TL}$ directly, we fix the phases of the basis vectors so that the matrices $(l_{ij,2}) = (|p\rangle\ |q\rangle)$ and $(l_{ij,1}) = (|R\rangle\ |S\rangle)^\dagger$ become unimodular. This in general changes the complex phases of $a_{VH}$ and of $b_{HV}$. In the next steps we also choose the new basis pairs so that they form matrices of determinant 1.
This only leaves the choice of balancing the phase between the two vectors in each of the four pairs. For example, multiplying $|p\rangle$ by $e^{i\phi}$ and $|q\rangle$ by $e^{-i\phi}$ leaves $(l_{ij,2})$ unimodular and becomes a no-operation if compensated by simultaneously multiplying $a_{HH}$ and $a_{HV}$ by $e^{-i\phi}$ and $b_{VH}$ and $b_{VV}$ by $e^{i\phi}$. But this amounts to a phase change in one row of the matrix $a_{ij}$ and the opposite phase change in one row of $b_{ij}$. In such a transform, all the matrices keep their determinants except the latter two, whose determinants are modified by mutually opposite phases. At a certain phase the determinants become equal, and the common phase of the two matrices can be factored out of the decomposition as an unphysical complex prefactor.
In the special case we find angles α, β such that Then which corresponds to choosing l HH,1 l HV,1 l V H,1 l V V,1 = e −iα C T L , l HH,1 l HV,1 l HH,2 l HV,2 l V H,2 l V V,2 = 1, all of which are unimodular, as required.
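The elementary fact underlying the restriction to special unitary submatrices is that any $U(2)$ matrix can be written as a phase factor times an $SU(2)$ matrix. The following sketch illustrates only this factoring step, not the full phase-balancing procedure of the proof; the function names are hypothetical and the Hadamard matrix is used merely as a convenient test input.

```python
import cmath
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def split_global_phase(m):
    """Return (phase, su) with m = exp(i*phase) * su and det(su) = 1.

    Assumes m is unitary, so that |det(m)| = 1.
    """
    phase = cmath.phase(det2(m)) / 2.0
    factor = cmath.exp(-1j * phase)
    return phase, [[factor * x for x in row] for row in m]


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    hadamard = [[s, s], [s, -s]]  # det = -1, so a phase must be split off
    phase, su = split_global_phase(hadamard)
    print(phase, det2(su))
```

The phase is defined only up to a sign of the $SU(2)$ factor, which is one reason why the proof can treat it as an unphysical prefactor.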

Split-step walks feature only two wavefronts
In this section we show that split-step walks are limited to exhibit two counter-propagating wavefronts. The analysis is based on the dispersion relation of the quantum walk operator $U$. The relation can be obtained for a translation invariant DTQW by considering the Fourier transform $U(k)$ of the walk operator, and calculating the eigenvalue spectrum $\lambda_j(k) = e^{i\omega_j(k)}$ for each $k \in [-\pi, \pi)$. The group velocity defined as $v_g(k) = \omega'(k) = d\omega(k)/dk$ plays an important role in determining the propagation speeds of the wavefronts [60]. The wavefront velocities are given by the set $\{v_g(k) \mid k \in [-\pi, \pi) \text{ s.t. } \omega''(k) = 0\}$; therefore, the number of distinct wavefronts strongly depends on the number of solutions to the equation
$\omega''(k) = 0$. (A36)
Split-step walks are defined by a walk operator of the form where $S_+$ and $S_-$ respectively shift the $|R\rangle$ component to the right, and the $|L\rangle$ component to the left, while leaving the other component unchanged. The two coin operators can be taken to be $SU(2)$ matrices, thus described by pairs of complex parameters satisfying $|u_1|^2 + |v_1|^2 = |u_2|^2 + |v_2|^2 = 1$, as usual. The eigenvalues of the operator in Eq. (A37) obey the equation $\cos\omega_\pm(k) = \pm\cos(k + \phi)\,\tilde{u} + \tilde{v}$, where $\phi = \arg(u_1 u_2)$, $\tilde{u} = |u_1 u_2|$ and $\tilde{v} = \Re(v_1 v_2^*)$. Formally solving Eq. (A36) yields $\cos(k + \phi) = a \pm \sqrt{a^2 - 1}$, with $a = (\tilde{u}^2 + \tilde{v}^2 - 1)/(2\tilde{u}\tilde{v})$ being a real number. To obtain a real solution for $k$, the right hand side of the equation must be real. This is fulfilled only if $|a| \geq 1$ holds. However, the right hand side must also be between $-1$ and $1$. Therefore, for negative $a$ the only valid equation is $\cos(k + \phi) = a + \sqrt{a^2 - 1}$, while for positive $a$ it is $\cos(k + \phi) = a - \sqrt{a^2 - 1}$. This still yields two solutions for $k$ due to the cosine function being even. However, as we shall see, this symmetry does not yield any additional wavefronts.
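Indeed, differentiating the eigenvalue equation with respect to $k$, the group velocities for the two bands $\omega_\pm$ are given by the equation $v_g^{\pm}(k) = \pm\tilde{u}\,\sin(k + \phi)/\sin\omega_\pm(k)$. Thus, while formally there are four solutions, they are pairwise degenerate since the sine function is odd.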
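The counting of wavefronts can be reproduced numerically. The sketch below assumes the band form $\cos\omega_\pm = \pm\tilde{u}\cos\kappa + \tilde{v}$ with $\kappa = k + \phi$ quoted above, takes $\omega_\pm \in [0, \pi]$, and uses illustrative parameter values (real rotation coins with angles $\pi/4$ and $\pi/8$) that are not taken from the experiment. The inflection condition is solved as a quadratic in $\cos\kappa$ whose two roots multiply to one, so the admissible root is selected by its modulus rather than by the sign of $a$; this makes the code independent of the sign convention for the two bands. Function names are hypothetical.

```python
import math


def group_velocity(kappa, u, v, sign=+1):
    """d(omega)/d(kappa) for cos(omega) = sign*u*cos(kappa) + v, omega in [0, pi]."""
    f = sign * u * math.cos(kappa) + v
    return sign * u * math.sin(kappa) / math.sqrt(1.0 - f * f)


def inflection_cosine(u, v, sign=+1):
    """Root of c^2 + 2*sign*a*c + 1 = 0 with |c| <= 1, where c = cos(kappa)."""
    a = (u * u + v * v - 1.0) / (2.0 * u * v)
    root = math.sqrt(a * a - 1.0)
    candidates = (-sign * a + root, -sign * a - root)
    return min(candidates, key=abs)  # the roots multiply to 1


def wavefront_speeds(u, v):
    """Absolute group velocities at all inflection points of both bands."""
    speeds = []
    for sign in (+1, -1):
        kappa = math.acos(inflection_cosine(u, v, sign))
        for k in (kappa, -kappa):  # cosine is even, so both signs solve the equation
            speeds.append(abs(group_velocity(k, u, v, sign)))
    return speeds


if __name__ == "__main__":
    # Illustrative parameters only (assumed real rotation coins).
    u = math.cos(math.pi / 4) * math.cos(math.pi / 8)
    v = -math.sin(math.pi / 4) * math.sin(math.pi / 8)
    print(wavefront_speeds(u, v))
```

All four formal solutions yield the same speed in this example, so the walk shows a single pair of counter-propagating wavefronts, in line with the argument of this section.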

Wavefronts with four-dimensional coins
As mentioned above, calculations of DTQWs with experimentally relevant four-dimensional coin operators, given by Eq. (8), reveal that these walks may exhibit up to eight wavefronts (i.e. four in each direction) in their position distribution. Here we supply some remarks on how to interpret the band structure plots and the conclusions drawn from them.
The analysis of the dispersion spectrum, analogously to the case of split-step walks in Sec. A 3, can be greatly simplified by observing that the symmetries of the evolution operator guarantee that the branches are related to one another by combinations of displacements or reflections. Since these transformations do not affect the number of inflection points and the absolute value of the associated group velocities, a single branch can provide information about the entire structure. Due to the vertical reflection relation, wavefronts will always appear in counter-propagating pairs, each with the same speed.
We emphasize that the dispersion analysis merely provides the velocities of the wavefronts, and does not provide information about their characteristic widths or relative intensities, partly because these parameters depend strongly on the initial coin state. To be able to experimentally resolve two wavefronts, the difference in their velocities must be large enough so that their relative distances exceed their characteristic widths within the experimental time frame.

Here we describe an experiment testing the coherence properties of statically coupling the traveling directions in a situation where the results have a clear intuitive interpretation. We have set the QWP implementing operator $C_A$ to swap the polarizations, resulting in the preservation of the traveling directions as explained earlier.
The operation $C_B$ in the other arm is implemented by a QWP at 0°, thus reversing the traveling directions in the subsequent step operation (Eq. (2)). The situation is described by the coin operation (B1) The coin operator $C_{LL}$ in the loop is set to a Hadamard operation by using a HWP at 22.5° as in Eq. (9), having no effect on the traveling direction. The obtained results are presented in Fig. 10, along with numerical simulations for reference. The strong similarity of 93.1 % between simulation and experiment proves excellent coherence properties even when both traveling directions are involved.

Circles with 4 and 16 sites
In this section we present data from experiments realizing walks on circles. We recall that the inner and boundary positions are distinguished by employing all three EOMs to perform the appropriate coin operators.
In our first example we realize dynamics on a circle of size 16. For this graph the boundaries are implemented at x = ±4. With a non-mixing coin, the input polarization $|D\rangle = \frac{1}{\sqrt{2}}(|H\rangle + |V\rangle)$ in the cc direction initiates two counter-propagating localized components in the walker's wave function. In Fig. 12 we present the experimental and numerical data showing how the walker is initially in the cc subspace corresponding to the upper semicircle (see Fig. 11 for the convention). In the fourth step it is transferred to the lower semicircle and the two components meet again at x = 0 corresponding to m = 8. The high extinction of the intensity at the ideally unoccupied positions witnesses the quality of switchings at the inner and the boundary positions. Results for the mixing H operation (cf. Eq. (13)) are presented in Fig. 13, displaying similar features, however, with visible effects of dispersion and slower propagation speeds on the propagating waves. Note that we applied a different plotting convention here and just display the relevant positions m of the circle, see Fig. 11.
The equidistribution effect can be observed in the results presented in Fig. 14c for the dynamics on a circle of 8 sites between steps 10 through 14. Note that even (odd) positions are unreachable in odd (even) step numbers with a localized initial state, thus it is understood that the distribution is uniform over the set of positions that are allowed by the dynamics. Implementation of a smaller circle of size 4, in which the boundary positions are at x = 0 and x = 1, involves a significantly higher number of EOM switchings resulting in higher overall error, but still a similarity of more than 80%. The observed dynamics is presented in Fig. 15. In this case an initially localized state also goes through phases of equidistribution and revival. The period of revival is 8 [67,68], but contrary to the previous case the equidistribution does not happen at one half of that time, but earlier at steps 2 and 3. At step 4 we instead observe a phenomenon where the probability distribution of the initial state reappears, but shifted to a node opposite the starting node. The equidistribution time, then, is one half of the first time of occurrence of this "shifted revival".
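The revival periods quoted above can be cross-checked from the spectrum of the walk operator. The sketch below again assumes the standard Hadamard coin with a two-dimensional coin space on an ideal cycle (not the Hadamard-like coin of Eq. (13)), for which the Fourier-transformed step is $U(k) = \mathrm{diag}(e^{-ik}, e^{ik})\,H$ with determinant $-1$. A full revival after $T$ steps requires every eigenvalue to satisfy $\lambda^T = 1$. Function names are hypothetical.

```python
import cmath
import math


def step_eigenvalues(k):
    """Eigenvalues of U(k) = diag(e^{-ik}, e^{ik}) H for the Hadamard coin H."""
    trace = (cmath.exp(-1j * k) - cmath.exp(1j * k)) / math.sqrt(2.0)
    # Characteristic polynomial: lambda^2 - trace*lambda + det, with det = -1.
    disc = cmath.sqrt(trace * trace + 4.0)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def revival_period(n_sites, max_steps=200, tol=1e-9):
    """Smallest T <= max_steps with U^T = 1 on an n_sites cycle, or None."""
    eigenvalues = [lam
                   for j in range(n_sites)
                   for lam in step_eigenvalues(2.0 * math.pi * j / n_sites)]
    for t in range(1, max_steps + 1):
        if all(abs(lam ** t - 1.0) < tol for lam in eigenvalues):
            return t
    return None


if __name__ == "__main__":
    for n in (4, 8):
        print(n, revival_period(n))
```

For 4 and 8 sites this idealized model gives periods of 8 and 24 steps, matching the values cited in the text. At half the period all eigenvalues are $\pm 1$, which is compatible with both the shifted revival on the 4-site circle and the equidistribution on the 8-site circle.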

Error discussion
In this section we describe the method used for determining the extent of the error bars in Figs. 6 and 8.
The measurements of intensity distributions are subject to inhomogeneities in the detection efficiencies for each of the four internal basis states and inaccuracies in the angles of the statically and dynamically implemented coins. Assuming errors of the detection efficiencies of ±2.5 % and of the coin angles of ±1°, we conduct a Monte Carlo simulation in which we randomly generate 1000 different settings for these quantities within the assumed error range. For each of these settings we calculate the deviation of the resulting numeric intensity distribution from a reference intensity distribution. This reference is obtained when running the numerical simulation with the fit parameters allowing for the closest approximation of the experimental results. The error for the individual positions and polarizations is then calculated as the standard deviation of the randomly generated samples from the reference distribution.
The errors of the similarity to an equidistribution are determined via error propagation from the errors of the intensities, resulting in the error bars visible in Fig. 8.
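The Monte Carlo procedure described in this section can be sketched as follows. The code is a minimal illustration under stated assumptions: the full walk simulation is replaced by a hypothetical two-outcome toy model (a half-wave plate acting on horizontal polarization), the perturbations are drawn uniformly within the quoted ranges, and the reference uses ideal efficiencies instead of fitted parameters. None of the function names or the toy model are taken from the actual analysis code.

```python
import math
import random


def toy_intensities(angle_deg, efficiencies):
    """Hypothetical stand-in for the walk simulation: HWP at angle_deg acting on |H>."""
    theta = math.radians(angle_deg)
    ideal = [math.cos(2.0 * theta) ** 2, math.sin(2.0 * theta) ** 2]
    detected = [p * eta for p, eta in zip(ideal, efficiencies)]
    total = sum(detected)
    return [d / total for d in detected]  # normalized intensity distribution


def monte_carlo_errors(ref_angle_deg=22.5, n_samples=1000,
                       angle_err_deg=1.0, eff_err=0.025, seed=0):
    """Standard deviation of perturbed samples around the reference distribution."""
    rng = random.Random(seed)
    reference = toy_intensities(ref_angle_deg, [1.0, 1.0])
    squared_dev = [0.0] * len(reference)
    for _ in range(n_samples):
        # Uniform sampling within the error range is an assumption of this sketch.
        angle = ref_angle_deg + rng.uniform(-angle_err_deg, angle_err_deg)
        effs = [1.0 + rng.uniform(-eff_err, eff_err) for _ in reference]
        sample = toy_intensities(angle, effs)
        for i, (s, r) in enumerate(zip(sample, reference)):
            squared_dev[i] += (s - r) ** 2
    errors = [math.sqrt(d / n_samples) for d in squared_dev]
    return reference, errors


if __name__ == "__main__":
    reference, errors = monte_carlo_errors()
    for r, e in zip(reference, errors):
        print(f"{r:.3f} +/- {e:.3f}")
```

In the actual analysis the toy model would be replaced by the full numerical walk simulation, with one efficiency per internal basis state and one angle per static or dynamic coin element.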