Experimental superposition of a quantum evolution with its time reverse

In the macroscopic world, time is intrinsically asymmetric, flowing in a specific direction, from past to future. However, the same is not necessarily true for quantum systems, as some quantum processes produce valid quantum evolutions under time reversal. Supposing that such processes can be probed in both time directions, we can also consider quantum processes probed in a coherent superposition of forwards and backwards time directions. This yields a broader class of quantum processes than the ones considered so far in the literature, including those with indefinite causal order. In this work, we demonstrate for the first time an operation belonging to this new class: the quantum time flip. Using a photonic realisation of this operation, we apply it to a game formulated as a discrimination task between two sets of operators. This game not only serves as a witness of an indefinite time direction, but also allows for a computational advantage over strategies using a fixed time direction, and even those with an indefinite causal order.


INTRODUCTION
In recent years, the framework of quantum theory has been generalised to describe agents interacting through quantum processes with indefinite causal orders [1][2][3]. These processes have been realised experimentally using photonic platforms [4][5][6][7][8], thereby witnessing the implementation of causally non-separable series of events. Remarkably, these are not the most general processes allowed by quantum mechanics. Take, for example, the quantum SWITCH [2]: even though the causal order of the constituent events is indefinite, each operation is accessed only in a single time direction. By considering processes where the time direction of the underlying operations is indefinite, one can go beyond the framework of indefinite causality. Indeed, a quantum superposition of evolutions with opposite thermodynamic arrows of time was first proposed in [9].
Processes with an indefinite time direction can be studied by considering operations that exhibit a time symmetry; these operations admit a change of reference frame that yields a valid quantum evolution in which the time coordinate is inverted. Unitary channels are an example of such operations, and in particular they admit the following time-reversal symmetries: for every evolution $U$, both the inverse $U \to U^{-1}$ and the transpose $U \to U^{T}$ are valid time-reversal operations. The presence of such a symmetry naturally excludes evolutions with an arrow of time, such as the thermodynamic processes studied in Ref. [9]. Given quantum operations that can in principle be accessed in both time directions, we can consider coherent superpositions of transformations made in the forwards and backwards time directions. This amounts to a new kind of process, which we will refer to as being inseparable in its time direction, an example of which (called the quantum time flip) was recently introduced in Ref. [10]. This process cannot be realised within the quantum circuit model. In this work we nevertheless present a photonic implementation of the quantum time flip by exploiting device-dependent symmetries of our experimental apparatus. A quantum state undergoing a time evolution is encoded in the polarization degree of freedom of a single photon, while a control qubit determining the time direction is encoded in its path degree of freedom. We show that polarization operations with waveplates naturally implement different time directions for forwards and backwards propagation directions through the waveplates, given the correct Stokes-parameter convention. This results in a deterministic time-reversal, in contrast to more general approaches which may involve multiple uses of the input operation in combination with probabilistic or non-exact methods [11][12][13][14][15][16][17][18][19][20]. We can furthermore realise the quantum time flip deterministically by passing the photon through the waveplates in a superposition of the two propagation directions.
We certify the indefinite time direction by demonstrating an information-theoretic advantage of the quantum time flip in the context of a computational game. In this setting, the quantum time flip not only outperforms strategies that utilise operations with a fixed time direction, but even strategies that exploit operations with an indefinite causal order [4,21].

FIG. 1. Time-reversal and the quantum time flip. (a) The forwards (top) and backwards (bottom) directions of the same time-evolution are shown in yellow and blue, respectively. The backwards time-evolution is given by some function $f$ of the forwards evolution, and decomposing the total time evolution into steps shows that $f$ must be order reversing. The inverse and transpose are examples of such order reversing functions. (b) Quantum gates are often modelled as black boxes with an input and an output. In this work, we consider black boxes that can be accessed in two different directions, producing either the forwards or backwards time-evolution depending on which direction the box is accessed in. Here, the backwards time-evolution is taken to be the transpose. (c) A control degree of freedom can be introduced to control in which direction the black box is accessed. (d) By putting the control qubit in a coherent superposition of the two states in (c) the box is accessed in a superposition of both directions, and the input state is propagated in a superposition of time directions. This is a realisation of the quantum time flip. (e) The quantum time flip can be applied to more than a single gate. This figure illustrates a scenario where two gates are accessed in a superposition of orders, in which they always have the opposite time directions. As described in the main text, this use of two quantum time flips can yield a computational advantage.

QUANTUM CIRCUITS, UNITARY TRANSPOSITION, AND PROCESSES WITH INDEFINITE TIME DIRECTION
The standard quantum circuit formalism provides solid grounds for quantum computing and forms the basis for quantum complexity theory [22,23]. However, it also imposes limitations on how we apply quantum theory. In a circuit, operations necessarily respect a definite causal order and the strict notion of input and output. The existence of time reversal processes such as unitary transposition is forbidden by the standard circuit formalism when given access to one [15,16] or even two [10] uses of an unknown unitary. However, for practical and foundational reasons, researchers have been designing and pursuing non-exact and probabilistic schemes aimed towards this goal [11][12][13][14][15][16][17][18][19]. Remarkably, a very recent work shows that in the qubit case, when four uses of the input operation are available, there exists a quantum circuit to invert arbitrary unitary operations [20].
In quantum theory, reversible operations are described by unitary operators. Processes which reverse a composition of such operations may be expressed by a function $f$ satisfying

$$f(UV) = f(V)\,f(U) \qquad (1)$$

for all unitary operators $U$ and $V$ (see Fig. 1). Under natural assumptions, it can be proven that, up to a unitary transformation, there are only two time reversal functions $f$: unitary transposition $f(U) = U^{T}$ and unitary inversion $f(U) = U^{-1}$ [10]. For two-dimensional systems, unitary transposition and unitary inversion are unitarily equivalent via a Pauli $\sigma_Y$ operation. This follows from the identity $U^{-1} = \sigma_Y U^{T} \sigma_Y$, which holds for all operators $U \in SU(2)$. Hence, for qubits, universal unitary transposition is possible if and only if unitary inversion is possible. This equivalence does not hold for higher-dimensional systems, and in these cases the transpose is singled out as the only time-reversal operator for which the quantum time flip is defined [10]. Together with the fact that in the Choi-matrix formalism the transpose has the interpretation of exchanging the roles of input and output of a channel, this motivates the choice of the transpose as time-reversal operator for qubit systems.
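The two properties used above can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical helper `random_su2` (not part of the original work) that builds an SU(2) matrix from a normalised pair of complex numbers. It tests that the transpose is order reversing and that $U^{-1} = \sigma_Y U^{T} \sigma_Y$ for randomly drawn qubit unitaries.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def random_su2(rng):
    """Hypothetical helper: SU(2) matrix [[a, -b*], [b, a*]] with |a|^2 + |b|^2 = 1."""
    v = rng.normal(size=4)
    v /= np.linalg.norm(v)
    a, b = v[0] + 1j * v[1], v[2] + 1j * v[3]
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])


sigma_y = np.array([[0, -1j], [1j, 0]])

U, V = random_su2(rng), random_su2(rng)

# Transposition reverses the order of a composition: (UV)^T = V^T U^T.
order_reversing = np.allclose((U @ V).T, V.T @ U.T)

# For SU(2), inversion equals transposition conjugated by sigma_Y.
inverse_from_transpose = np.allclose(np.linalg.inv(U), sigma_y @ U.T @ sigma_y)

print(order_reversing, inverse_from_transpose)
```

A random draw is of course not a proof; the check only illustrates the identity for sampled matrices.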
When focusing on a particular physical implementation, the general aspects of the standard quantum circuit formalism may limit our view and lead to an apparent mismatch between theory and practice. A known illustrative example is the universal coherent control of unitary operations, where an arbitrary unitary $U$ is applied to the target system conditional on the state of a control qubit: while it is not possible to design a quantum circuit to perform universal control, a simple Mach-Zehnder optical interferometer can be used for this task [24][25][26]. Indeed, experimental control of black box quantum gates has been demonstrated [27,28]. Such experimental implementations exploit the knowledge of the position of the physical device performing the gate, circumventing this apparent limitation imposed by the quantum circuit formalism.

FIG. 2. Classes of game strategies. The figure depicts the different strategies for the game described in the main text and their corresponding maximum winning probabilities $p$. These maximum winning probabilities are obtained through an optimization over all possible choices of the resources shown in dark blue, and hold for pairs of unitaries $(U, V)$ uniformly randomly picked from the sets $M_+$ and $M_-$. The state $\rho$, for example, is allowed to contain any number of auxiliary degrees of freedom, and analogous statements hold for the measurement $M$, channel $E$ and process $W$. The three strategies differ in how they are able to access the gates picked by the referee. The strategies (a)-(c) are shown here in the forwards time direction, but are also valid in the backwards time direction in which both gates are transposed. Each subsequent strategy is strictly better than the previous one, and only players who have access to a quantum time flip process can win the game with unity probability.
Although time reversal processes such as unitary transposition are not possible within the standard circuit formalism when given access to one [15,16] or even two [10] uses of an unknown unitary, in this work we implement general qubit unitary transposition, as well as the quantum time flip process, using a particular optical construction. Similar to the case of universal coherent control, we make use of knowledge about our specific experimental apparatus to realise a black box unitary that may be used in two different directions. As shown in Fig. 1(b), this box implements $U$ in the 'forwards' direction, while in the 'backwards' direction it has the effect of the transposed operation $U^{T}$. Moreover, in addition to "simply" reversing a quantum evolution, we also coherently superpose the forwards and backwards time evolutions, and in so doing perform an optical implementation of a process with an indefinite time direction [10], i.e. one which cannot be described as a convex mixture of processes in which each gate is accessed only in one time direction. The process that we implement optically is the quantum time flip for unitary transposition, a process which acts on unitary operations as

$$U \mapsto U \otimes |0\rangle\langle 0|_C + U^{T} \otimes |1\rangle\langle 1|_C. \qquad (2)$$

We then compose the time flip process of Eq. (2) with its flipped version, $V \mapsto V^{T} \otimes |0\rangle\langle 0|_C + V \otimes |1\rangle\langle 1|_C$, to obtain a process which acts on a pair of unitary operators as

$$(U, V) \mapsto UV^{T} \otimes |0\rangle\langle 0|_C + U^{T}V \otimes |1\rangle\langle 1|_C. \qquad (3)$$

In addition to having an indefinite time direction, the process described in Eq. (3) cannot be described by general process matrices with indefinite causality such as the quantum switch [2] or the Oreshkov-Costa-Brukner (OCB) process [3]. In the next section, we will explain how to witness this property.

GAME DESCRIPTION
We now describe a discrimination task, first introduced in Ref. [10], where the quantum time flip process will be used as a resource to increase our performance. In this game, a referee provides the player with two black box unitaries, $U$ and $V$, belonging to either the set $M_+$ or $M_-$, which are known to respect the property

$$UV^{T} = \pm\, U^{T}V \quad \text{for } (U, V) \in M_\pm. \qquad (4)$$

The player is then challenged to determine which of the two sets the gates were picked from, while only being allowed to access each of the black boxes once.
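As discussed in the previous section, a player able to perform the quantum time flip may implement the process in Eq. (3). Consider as a strategy an initial state of the form $|\psi\rangle_T \otimes |+\rangle_C$, where $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$, $|\psi\rangle$ is an arbitrary state, and the subscripts $C$ and $T$ refer to the control and target qubits. Sending this state through the gate in Eq. (3) gives the state

$$\left(UV^{T} \otimes |0\rangle\langle 0|_C + U^{T}V \otimes |1\rangle\langle 1|_C\right) |\psi\rangle_T \otimes |+\rangle_C = UV^{T}|\psi\rangle_T \otimes |\pm\rangle_C \quad \text{for } (U, V) \in M_\pm. \qquad (6)$$

Since the states $|\pm\rangle$ are orthogonal, a player using this strategy can always correctly determine which set was chosen by the referee.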
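The strategy described above can be sketched numerically. The snippet below is illustrative only: it assumes NumPy, orders the tensor factors as target ⊗ control, and uses two example pairs chosen here for convenience, $(X, Z)$, which satisfies $UV^{T} = U^{T}V$, and $(X, Y)$, which satisfies $UV^{T} = -U^{T}V$. The pair $(X, Y)$ is an assumption made for illustration and is not claimed to belong to the sets used in the experiment. The function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)  # |0><0| on the control
P1 = np.diag([0, 1]).astype(complex)  # |1><1| on the control
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def two_time_flips(U, V):
    """Composite process: U V^T on the target for control |0>, U^T V for control |1>."""
    return np.kron(U @ V.T, P0) + np.kron(U.T @ V, P1)


def guess_probabilities(U, V, psi):
    """Return (P('+'), P('-')) for the input |psi>_T |+>_C."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    out = two_time_flips(U, V) @ np.kron(psi, plus)
    # A Hadamard on the control maps |+> to |0> and |-> to |1>.
    out = np.kron(I2, H) @ out
    amps = out.reshape(2, 2)  # rows index the target, columns the control
    probs = np.sum(np.abs(amps) ** 2, axis=0)
    return probs[0], probs[1]


psi = np.array([0.6, 0.8j])  # arbitrary normalised target state

p_sym = guess_probabilities(X, Z, psi)   # pair with U V^T = +U^T V
p_anti = guess_probabilities(X, Y, psi)  # pair with U V^T = -U^T V
print(p_sym, p_anti)
```

For the first pair all of the probability ends up in the '+' outcome, and for the second pair in the '−' outcome, independently of the choice of `psi`.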
In contrast, players who do not have access to indefinite time strategies may not be able to ascertain with certainty to which set a given pair of unitaries $(U, V)$ belongs. In order to make this claim concrete, Ref. [10] considers a particular game where the set $M_+$ has 13 pairs of unitary operators respecting $UV^{T} = +U^{T}V$, and $M_-$ has 8 pairs of unitary operators respecting $UV^{T} = -U^{T}V$; these two sets of unitary operators are presented in Box 1. Here, we consider an average case variation of the aforementioned game, which goes as follows: with uniform probability $p = \frac{1}{13+8}$, the referee picks a pair of unitary operators $(U, V)$ from $M_+$ or $M_-$ and lets the player make a single use of each. We then consider the optimal success probability of players who have access to different kinds of resources. As indicated by Eq. (6), players who have access to the quantum time flip can always win with unity probability. The three other classes of strategies, shown in Fig. 2, only have access to a single time direction, forwards or backwards, and convex combinations of these strategies will be called separable in their time direction; a detailed mathematical characterisation of these strategies is presented in the Methods. Employing the computer-assisted proof methods of Ref. [29] we obtain upper bounds on the maximal success probabilities for players restricted to particular classes of strategies. The code for this is openly available in our online repository, see Methods for details.
The first alternative strategy we consider is one in which the player is restricted to using $U$ and $V$ in parallel, and this results in a maximal success probability that is bounded by $\frac{88}{100} \le p_{\mathrm{par}} \le \frac{89}{100}$. Next, we consider players restricted to causally ordered strategies, whose maximal success probability is found to be bounded by $\frac{90}{100} \le p_{\mathrm{causal}} \le \frac{91}{100}$. Finally, players given access to process matrices with indefinite causality (also called indefinite testers [30]), but with definite time direction, have their maximal success probability bounded by $\frac{91}{100} < p_{\mathrm{i.c.}} \le \frac{92}{100}$. Unlike the task in [31], in which causally ordered and general non-quantum-circuit-model strategies perform equally well, this game is hence an example of a channel discrimination task with strict hierarchy between four different classes of strategies. Additionally, while the operations selected by the referee are treated as being fully characterised in the above analysis, there are no assumptions made about the measurements performed by the player, and these can remain unknown. This is therefore an example of a semi-device-independent certification of an indefinite time direction [32,33]. This stands in contrast to witness based approaches, previously used to certify advantages in channel discrimination tasks [5], in which one needs well characterised measurement devices in order to evaluate the witness operator.

EXPERIMENT
Our photonic implementation of the game described in the previous section makes use of the quantum time-flip strategy from Eq. (6) to achieve a success probability exceeding that of any strategy only using the gates in one time direction. To coherently apply the quantum time flip, we employ polarization optics in a partially common-path interferometer, depicted in Fig. 3, with the control and target qubits being encoded in the path and polarization degrees of freedom of a single photon, respectively. Our experiment makes use of two quantum time flips, sequentially applied to the two unitaries $V$ and $U$. The resulting controlled channel is the one of Eq. (6) where the gates $UV^{T}$ and $U^{T}V$ act on the target (polarization) qubit and are implemented using two Simon-Mukunda polarization gadgets consisting of three waveplates each [34], for which the transpose operation is obtained by reversing the propagation direction.
Such polarization gadgets generally do not realise the transpose operation in the backwards propagation direction, but rather a related operation:

$$U_{\mathrm{bw}} = P\, U_{\mathrm{fw}}^{T}\, P^{-1},$$

where $P$ is a matrix describing the change of reference frame to the backwards direction, and the subscripts indicate the propagation direction. While it is possible to construct a gadget that implements the transpose by introducing time-reversal symmetry breaking elements [36], here we instead exploit the fact that the transpose is a basis-dependent operation. More concretely, by adopting the convention $(S_1, S_2, S_3) \leftrightarrow (-Z, -Y, -X)$ for our Stokes parameters [37] we find that $P = 1$, and the polarization gadgets transform as the transpose under counterpropagation (see Methods). Superimposing two propagation directions through a gadget therefore allows us to implement the quantum time flip, with the photon path acting as a control degree of freedom. The specific coherent superposition of time flips in Eq. (3) is achieved through the use of fiber optic circulators.

The optical circuit in Fig. 3 begins with a bulk beamsplitter that initializes the control qubit into the state $\frac{1}{\sqrt{2}}\left(|0\rangle_C + |1\rangle_C\right)$, after which two fiber circulators guide the photons through the $V$ gadget in two different directions, giving the joint control-target state

$$\frac{1}{\sqrt{2}}\left(V^{T}|\psi\rangle_T \otimes |0\rangle_C + V|\psi\rangle_T \otimes |1\rangle_C\right).$$

Entering the circulators from a different port, the photons are then directed to the $U$ gadget, which they once again propagate through in opposite directions, transforming the joint state to

$$\frac{1}{\sqrt{2}}\left(UV^{T}|\psi\rangle_T \otimes |0\rangle_C + U^{T}V|\psi\rangle_T \otimes |1\rangle_C\right).$$

At the end of the optical circuit, a fiber beam-splitter applies a Hadamard gate on the control qubit, giving the state

$$\frac{1}{2}\left[\left(UV^{T} + U^{T}V\right)|\psi\rangle_T \otimes |0\rangle_C + \left(UV^{T} - U^{T}V\right)|\psi\rangle_T \otimes |1\rangle_C\right].$$

A projective measurement on the control (path) qubit in the computational basis then reveals whether $(U, V)$ belong to $M_+$ or $M_-$. The partially common-path structure of the interferometer has two distinct advantages: (1) photons in the two different propagation directions of the interferometer hit exactly the same spots on the waveplates, and the physical symmetries of the gadget therefore ensure the faithful implementation of the time flip independently of any imperfections in the waveplates; (2) the paths traversed in both directions do not contribute any phase noise to the interferometer, thereby simplifying the phase stabilization. More specifically, only the paths connecting the two beam-splitters with the fiber circulators, as well as the fibers directly between the circulators, add phase noise to the interferometer. These fiber components, as well as the bulk beam-splitter at the interferometer input, are housed in a thermally and acoustically insulated box. The passive stabilization of these elements is sufficient to bring the phase drift down to a value of approximately 10 mrad min$^{-1}$. The use of a bulk beam-splitter at the input was chosen in order to balance the losses induced by the fiber circulators through the free-space to fiber coupling, and to give control over the interferometer phase through a piezo-electric actuator. This piezo was used to reset the phase of the interferometer prior to beginning the measurements, and was not employed for active feedback. The fiber beam-splitter at the output ensures perfect spatial mode overlap for high interferometric visibility.

FIG. 3. Experimental apparatus. (a) A type-II spontaneous parametric down-conversion source generates frequency degenerate single-photon pairs at 1546 nm in a ppKTP crystal (top). The signal photon is sent to a heralding detector, while the idler photon is routed to a balanced bulk beam-splitter and coupled into single-mode fiber. Blue (green) arrows indicate the photon path corresponding to the control-qubit state $|0\rangle_C$ ($|1\rangle_C$). A piezo-electric actuator attached to one of the fiber couplers allows for control over the interferometric phase, and pairs of HWPs / QWPs are used in combination with fiber polarization controllers for polarization compensation through the fibers. This use of redundant polarization elements both improves and simplifies the compensation procedure [35]. After the initial beam-splitter two fiber circulators guide the photon through the $V$-gadget. Propagating through the gadget in the 'forwards' direction implements the unitary $V$, while propagating 'backwards' has the effect of applying the transposed operation relative to the 'forwards' direction. One of the two paths through the first gadget therefore results in $V^{T}$ being applied instead of $V$. Two additional circulators then route the photon through a gadget implementing $U$ ($U^{T}$) in the 'forwards' ('backwards') direction. Finally, the photon is sent to a fiber beam-splitter, which applies a Hadamard gate on the path degree of freedom, and correlates the two spatial output modes with the sets $M_+$, $M_-$. Detection is performed by superconducting nanowire single-photon detectors (SNSPDs) housed in a 1 K cryostat. Additional QWP/HWP pairs are used to compensate fiber-induced polarization rotations. (b) The fiber circulators route the light from port 1 → 2, from 2 → 3 and block light entering in port 3. The bidirectional boxes in Fig. 1 are realised using sets of three waveplates. Depending on the propagation direction, they implement either the unitary operation $U$/$V$ or $U^{T}$/$V^{T}$.

FIG. 4. Unitary transposition fidelity. The yellow and blue bars indicate the fidelity, $F$, of the unitaries $U$ (top) and $V$ (bottom) from the sets $M_+$ (left) and $M_-$ (right), measured in the forward propagation direction, with respect to the transpose of the reconstructed unitary measured in the backwards propagation direction. Taller bars indicate a higher fidelity between the unitaries in the two propagation directions. The average fidelity is $0.9992 \pm 6.5 \times 10^{-4}$, indicating that the gadgets faithfully implement the transpose. The uncertainties were estimated using a Monte-Carlo simulation of the tomography accounting for errors in the waveplate angles, and the superimposed box plot indicates the spread of the reconstructed fidelities. We attribute the residual errors to imperfect waveplate retardance in the tomography, and angle differences between the setting of the forwards and backwards unitaries, since in principle the gadgets perfectly implement the transpose of the unitary in the forward direction.

RESULTS
Before demonstrating the quantum time flip in the context of the game, we first verified the ability of a polarization gadget to implement both a unitary and its transpose simultaneously, in the two different propagation directions of the light. To this end, we performed quantum process tomography on the implemented unitaries from the sets $M_+$ and $M_-$, in both propagation directions. We then compared the fidelity $F$ between the reconstructed unitaries in the forward direction, $U_{\mathrm{fw}}$ and $V_{\mathrm{fw}}$, with the transposed reconstructed unitaries in the backwards direction, $U_{\mathrm{bw}}^{T}$ and $V_{\mathrm{bw}}^{T}$ (see Methods). The results of this are shown in Fig. 4. The average fidelity is greater than 0.999, indicating that the gadgets correctly implement the transpose. Note that the fidelity of the transpose is independent of any errors in the retardance of the waveplates in the gadget itself. Such imperfections would cause the fidelity in the implementation of a desired unitary to drop, but would affect the forward and backward directions symmetrically. The same is true for undesired offsets in the waveplate angles; however, in the measurements shown in Fig. 4 the unitaries in the two directions were measured in separate runs, causing them to indeed be sensitive to waveplate angle errors, in addition to errors in the tomography itself.

BOX 1 (fragment). $M_+^{I} = \{(I, I), (I, X), (I, Z), (X, I), (X, X), (X, Z), (Z, I), (Z, X), (Z, Z)\}$
Having verified the ability to implement a given unitary and its transpose with a single black box simultaneously, we then realised the game discussed in the previous sections. First, two-photon coincidence events for the different elements of $M_+$ and $M_-$ were collected sequentially to reduce the time spent rotating the waveplates. Second, the game itself was played using the collected data. In each round the referee uniformly randomly selects a pair of channels, and the player outputs an answer, '+' or '−', given by a unique two-photon event from the corresponding measurement set. Figure 5 shows the relative frequencies $f^{\mathrm{rel}}_{\pm,k} = N^{\pm}_{k}/N_{k}$, where $N^{\pm}_{k}$ is the number of times the player output the answer '±' when the channels $(U_k, V_k)$ were picked, and $N_k$ is the total number of times these channels were selected by the referee. It can be seen that, for every setting, the player outputs the correct answer with a relative frequency higher than the indefinite tester bound of 0.92, and by extension higher than the bound for any strategy that is separable in its time direction. More specifically, the average winning frequency is found to be 0.9945, with the best and worst case frequencies being 0.9993 and 0.9860, respectively.
The formulation of the indefinite-time-direction witness as a game with only two outcomes, win or lose, allows for a straightforward statistical interpretation of the results. Since we have an upper bound $p_{\mathrm{i.c.}} \le \frac{92}{100}$ on the probability of success for an indefinite tester, we can calculate the probability $P$ of such a player having obtained $v$ or more victories in $N$ rounds:

$$P = \sum_{k=v}^{N} \binom{N}{k}\, p_{\mathrm{i.c.}}^{\,k} \left(1 - p_{\mathrm{i.c.}}\right)^{N-k}.$$

This probability is exactly the P-value for the experimentally implemented process not being indefinite in its time direction. Out of the $N = 10^{6}$ rounds played in the experiment, $v = 994{,}512$ were won by successfully identifying the correct set, while $5{,}488$ rounds were lost. Using a Chernoff bound tailored for the binomial distribution, we can provide an upper bound on the P-value, given by

$$P \le \exp\!\left[-N\, D\!\left(\tfrac{v}{N} \,\Big\|\, p_{\mathrm{i.c.}}\right)\right],$$

where exp is the exponential function and $D(x\|y) = x \ln\frac{x}{y} + (1-x)\ln\frac{1-x}{1-y}$ is the relative entropy. Direct calculation using $p_{\mathrm{i.c.}} = 0.92$ shows that $D\!\left(\tfrac{v}{N} \,\|\, p_{\mathrm{i.c.}}\right) \approx 0.0627$, hence the P-value is upper bounded by $P \le e^{-10^{4}}$, which is an extremely small number. This rules out any explanation of the data in terms of convex mixtures of quantum processes that access the gates in a definite time direction. Since this is the defining characteristic for the class of processes with an indefinite time direction, we therefore conclude that the implemented process belongs to this class.
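The arithmetic behind this bound can be reproduced in a few lines. The sketch below uses only the Python standard library and assumes the binary relative entropy is taken with the natural logarithm, $D(x\|y) = x\ln(x/y) + (1-x)\ln[(1-x)/(1-y)]$; the function name is hypothetical. It works with the logarithm of the bound, since the bound itself underflows double precision.

```python
import math


def binary_relative_entropy(x, y):
    """Relative entropy D(x||y) between two Bernoulli distributions (natural log)."""
    return x * math.log(x / y) + (1 - x) * math.log((1 - x) / (1 - y))


N = 10**6      # rounds played
v = 994_512    # rounds won
p_ic = 0.92    # upper bound on the success probability of an indefinite tester

D = binary_relative_entropy(v / N, p_ic)

# Natural logarithm of the Chernoff upper bound exp(-N * D) on the P-value.
log_p_bound = -N * D

print(f"D = {D:.4f}, ln(P) <= {log_p_bound:.0f}")
```

With these inputs the exponent is well below $-10^{4}$, consistent with the bound quoted in the text.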

DISCUSSION
In this work we have demonstrated, for the first time, a process that is inseparable in its time direction. Using an optical interferometer, we implemented a coherent superposition of arbitrary unitary transformations and their time-reversal. Such a process can only be probabilistically simulated by a quantum circuit with a definite time direction. Even agents equipped with two copies of the gates and able to combine them in an indefinite order cannot realise the process deterministically, unless they are given the ability of pre- and postselecting quantum systems [2][38][39][40][41][42]. It is worth noting that our implementation of controlled unitary transposition is not in contradiction with the no-go theorem stating that there is no quantum circuit that can transform an unknown quantum unitary gate to its transpose [10,15,43]. Our implementation adopts a device that implements a single-qubit gate $U$, and while this gate can remain unknown, the physical device itself is neither arbitrary nor unknown. Indeed, it is the particular symmetries of the physical device that necessarily and deterministically generate the transposed gate $U^{T}$. While time itself does not flow backwards in any part of the experimental apparatus, our demonstration highlights the limitations of the quantum circuit model for describing the full range of quantum information processing protocols. This is analogous to the impossibility of perfect unitary coherent control within the quantum circuit model [10,25,28,44]. Through a channel discrimination game, in which we outperform any strategy with a definite time direction, we furthermore certify that the coherent superposition of time directions yields a process that is inseparable in its time direction.
The study of indefinite causality led to the discovery and realisation of quantum information protocols with practical advantages [45,46], as well as a lively debate about the interpretation of these realisations [47][48][49][50]. We envision that future studies of processes with an indefinite time direction will similarly expand both the theoretical and experimental toolkit and open up new avenues for quantum information processing. Indeed, a recent work has shown that processes with an indefinite time direction can show enhanced performance in certain communication tasks [51], and the experimental methods presented here could be used to demonstrate such advantages. We note that an experimental demonstration of an indefinite time direction was also presented in a parallel and independent work [52].
In the context of future work we note that universal transposition of single-qubit gates is a sufficient building block for the transposition of multi-qubit gates, for instance using a Reck decomposition [53], or through the inclusion of a reciprocal symmetric two-qubit gate [54]. We believe that the demonstration of coherent transposition of a two-qubit unitary using the former approach on a hyper-encoded two-qubit photonic state would be within experimental reach. Finally, the investigation of time reversed quantum processes also holds applications in quantum thermodynamics. Indeed, in [10], it was shown that the processes for which the quantum time flip produces another valid process are exactly those which can be written as linear combinations of unitary channels, that is, channels which do not decrease the entropy in either time direction. Nevertheless, the application of superpositions of two time directions in the context of thermodynamic work was recently studied in [9,55]. Such superpositions could be realised using the quantum time flip by having it act on the unitary dynamics of the joint state-environment system.

since: The transformation in Eq. (20) is not useful for realising the transpose, since the $Z$ gates around the unitary $U^{T}_{G,\mathrm{fw}}$ have to be undone to recover the transpose.
However, this problem can be overcome by picking a different convention for the polarization basis states, such as $(S_1, S_2, S_3) \leftrightarrow (X, Y, Z)$, which is a cyclic permutation of the aforementioned one (corresponding to a rotation of the basis vectors by $2\pi/3$ around the vector $(1, 1, 1)$) and which is commonly used in polarimetry. In this work, we chose the convention:
The minus signs are necessary to preserve the handedness of the coordinate system when exchanging X and Y. That this convention yields the desired transformation under counter-propagation can be seen by noting that the Stokes parameters of a unitary always transform as $(S_1, S_2, S_3) \to (S_1, -S_2, S_3)$; for completeness, however, we will perform the calculation explicitly. In the convention of Eq. (22), a linear retarder at an angle $\theta$ is written as:
and the corresponding unitary in the backwards direction is
It then follows that a general waveplate gadget also transforms as the transpose:
One could alternatively get around the problem with Eq. (20) by introducing two more polarization gadgets implementing Z operators on either side of the gadget in Eq. (20), and making sure that these additional gadgets only act on one propagation direction. This could be done, for example, by physically displacing the beam paths of the two propagation directions, so that the gadgets act on different spatial modes in the different propagation directions. This would, however, change the interpretation of the experiment with respect to the implementation in the main text, since the transformations in the two propagation directions would no longer be related by a physical symmetry. Instead, they would depend on the transformations realised by the additional gadgets.
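The role of the basis convention discussed above can be illustrated with a small numerical check. The following is a minimal sketch and not the analysis code of this work. It assumes that each waveplate is a linear retarder acting as a rotation about the Stokes axis $(\cos 2\theta, \sin 2\theta, 0)$, that counter-propagation flips the sign of $S_2$ and reverses the order of the plates, and that the "aforementioned" convention is the usual horizontal/vertical one, $(S_1, S_2, S_3) \leftrightarrow (Z, X, Y)$. The function names `retarder` and `gadget` and the chosen angles are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def retarder(theta, phi, paulis, backwards=False):
    """Linear retarder at angle theta with retardance phi.

    `paulis` = (P1, P2, P3) are the Pauli matrices assigned to (S1, S2, S3).
    The plate rotates about the Stokes axis (cos 2θ, sin 2θ, 0); counter-
    propagation is modelled (assumption) by flipping the sign of S2.
    """
    p1, p2, _ = paulis
    s1, s2 = np.cos(2 * theta), np.sin(2 * theta)
    if backwards:
        s2 = -s2
    return np.cos(phi / 2) * I2 - 1j * np.sin(phi / 2) * (s1 * p1 + s2 * p2)


def gadget(angles, retardances, paulis, backwards=False):
    """Unitary of a stack of plates traversed forwards or backwards."""
    plates = [retarder(t, p, paulis, backwards)
              for t, p in zip(angles, retardances)]
    # Forwards: plate 1 acts first, so U = P3 P2 P1.
    # Backwards: plate 3 acts first, so U = P1' P2' P3'.
    ordered = plates if backwards else plates[::-1]
    U = I2.copy()
    for plate in ordered:
        U = U @ plate
    return U


# Hypothetical QWP-HWP-QWP gadget with arbitrary angles (radians).
angles = [0.3, 1.1, -0.7]
retardances = [np.pi / 2, np.pi, np.pi / 2]

# Convention (S1, S2, S3) <-> (X, Y, Z): backwards gives the transpose.
conv_xyz = (X, Y, Z)
U_fw = gadget(angles, retardances, conv_xyz)
U_bw = gadget(angles, retardances, conv_xyz, backwards=True)
transpose_ok = np.allclose(U_bw, U_fw.T)

# Assumed H/V convention (S1, S2, S3) <-> (Z, X, Y): backwards gives Z U^T Z.
conv_zxy = (Z, X, Y)
W_fw = gadget(angles, retardances, conv_zxy)
W_bw = gadget(angles, retardances, conv_zxy, backwards=True)
sandwiched_ok = np.allclose(W_bw, Z @ W_fw.T @ Z)

print(transpose_ok, sandwiched_ok)
```

Under these assumptions the same physical gadget yields the bare transpose in one convention and the transpose sandwiched between Z gates in the other, which is the distinction drawn in the text.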

Obtaining upper bounds for different classes of strategies
We now detail how to obtain an upper bound on the winning probability of the game described in the main manuscript. Let $N$ be the total number of pairs of unitary operators contained in the sets $M_+$ and $M_-$. The referee picks a pair of unitary operators $(U_i, V_i)$ following a uniform distribution, i.e., with probability $1/N$. The player should then employ a quantum strategy to guess whether $(U_i, V_i)$ belongs to $M_+$ or $M_-$. Let $p(\pm|(U_i, V_i))$ be the probability that the player guesses $(U_i, V_i) \in M_\pm$. The probability that such a player wins the game is then given by
$$p_{\mathrm{win}} = \frac{1}{N}\left[\sum_{(U_i, V_i) \in M_+} p(+|(U_i, V_i)) + \sum_{(U_i, V_i) \in M_-} p(-|(U_i, V_i))\right].$$
For the qubit scenario considered here, we can analyse the case where unitary gates act backwards by simply considering the case where all involved unitary operators are transposed. This is true because, as discussed earlier, there are only two anti-homomorphisms from SU(d) to SU(d), and for any $U \in$ SU(2) we have that $U^{-1} = \sigma_Y U^T \sigma_Y$. More explicitly, the winning probability for players using the unitary gates backwards is given by
$$p_{\mathrm{win}}^{T} = \frac{1}{N}\left[\sum_{(U_i, V_i) \in M_+} p(+|(U_i^T, V_i^T)) + \sum_{(U_i, V_i) \in M_-} p(-|(U_i^T, V_i^T))\right].$$
Also, as we show more explicitly later, since the success probability is a linear function of the strategies, convex combinations of forwards and backwards strategies cannot increase the maximal success probability. Hence it is enough to analyse the forwards and backwards cases separately.
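The identity $U^{-1} = \sigma_Y U^T \sigma_Y$ used above can be checked numerically. The sketch below is only an illustration: it samples random SU(2) matrices from normalised Gaussian 4-vectors (the helper `random_su2` is a hypothetical name) and verifies that the inverse and the transpose differ only by fixed $\sigma_Y$ gates, which a player can absorb into their strategy.

```python
import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    v = rng.normal(size=4)
    v = v / np.linalg.norm(v)
    a = v[0] + 1j * v[1]
    b = v[2] + 1j * v[3]
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])


rng = np.random.default_rng(1)
samples = [random_su2(rng) for _ in range(100)]

# For every sample, compare the inverse with sigma_Y U^T sigma_Y.
identity_holds = all(
    np.allclose(np.linalg.inv(U), SIGMA_Y @ U.T @ SIGMA_Y) for U in samples
)
print(identity_holds)
```

Because the two time-reversal maps differ only by this fixed conjugation, bounds derived for transposed gates also cover strategies that access the inverses.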
When the player is restricted to parallel strategies, the most general approach consists of preparing a quantum state $\rho$, sending part of this state to the operators $U_i$ and $V_i$, and then performing a quantum measurement with outcomes labelled as + or −, that is,
where $M_+, M_- \geq 0$ are the POVM operators associated to the outcomes + and −; see Fig. 2 in the main text for a pictorial illustration. Parallel strategies may be analysed in the (parallel) tester formalism [29,56], also known as process POVM [57]. Let us label the linear spaces corresponding to the input and output spaces as $H_I$ and $H_O$, respectively. We can then write
In the tester formalism, operations are viewed as states and Eq. (28) may be written as the generalized Born rule. More formally, we have that:
where $\{|l\rangle\}$ is the computational basis for $H_I$. The operators $T_+$ and $T_-$ are parallel testers when $T_+, T_- \geq 0$ and their sum respects:
where $\sigma \in L(H_I)$ is a quantum state. As shown in Refs. [29,56,57], all parallel strategies as in Eq. (28) can be represented by testers such as those in Eq. (29), and vice versa. Hence, when optimizing over all possible strategies, instead of considering all possible states $\rho$ and measurements $M_\pm$ as in Eq. (28), we may optimize over all valid testers $T_\pm$ as in Eq. (29). One advantage of using the tester formalism is that the maximal probability of winning the discrimination game can be written in terms of a semidefinite program (SDP) via the following optimisation problem:
s.t.: $T_+, T_- \geq 0$ (33)
Following the steps of Ref. [29], the dual problem is given by:
min $\mathrm{tr}(C)/d_I$ (36)
where $d_I$ is the dimension of $H_I$ (for our particular problem, $d_I = 4$). By the definition of the dual problem, if we find a linear operator $C$ satisfying the feasibility constraints of inequality (37), inequality (38), and Eq. (39), the quantity $\mathrm{tr}(C)/d_I$ is an upper bound on the maximal success probability. In order to obtain a computer-assisted-proof upper bound expressed as a fraction of integers, we use standard and efficient floating-point algorithms to solve the SDP, obtain an operator $C$ which satisfies the constraints of the dual problem, and truncate it in such a way that the feasibility constraints are still satisfied. We refer to our online repository (see Code Availability) for an implementation of this procedure and to Ref. [29] for a detailed explanation of how to perform the truncation step.
When the player is restricted to causal strategies (also referred to as sequential strategies), the most general approach consists of preparing a quantum state $\rho$, sending part of this state to the operator $U_i$ (or $V_i$), applying a quantum channel $\mathcal{E}$, then performing the operation $V_i$ (or $U_i$), and finally performing a quantum measurement with outcomes labelled as + or −, that is:
Using the concept of sequential testers [29,56], we can also write the problem of finding the optimal causal strategy as an SDP. Since there is a notion of causal order, we label the input and output spaces of the first operation as $H_{I_1}$ and $H_{O_1}$, respectively. Analogously, we use $H_{I_2}$ and $H_{O_2}$ for the second operation. If the player uses the operation $U_i$ first and $V_i$ second, we have that $U_i : H_{I_1} \to H_{O_1}$ and $V_i : H_{I_2} \to H_{O_2}$. Following Ref. [29], the primal and dual problems for causal strategies are respectively given by
max
s.t.: $T_+, T_- \geq 0$ (41)

introduce Poissonian noise in this type of estimation. However, the statistical method we use to determine the confidence in our conclusion (the calculation of the P-value) allows us to make statements about the underlying probability distribution without directly estimating it. Specifically, it allows us to state that its expectation value exceeds the bound imposed on the winning probability of any strategy with a definite time direction. In order to filter out background events resulting from various back-reflections in the experimental setup, as well as detector dark counts, two-fold coincidence events between the signal and idler photons were used to time-filter the detection events.
The superconducting nanowire detectors used in the experiment have a slight polarization dependence in their detection efficiency, and because the different pairs of unitaries generate different target qubit states, the event rates for the different implemented unitaries varied. It was not necessary to account for this difference in efficiency, because the number of events for each pair of unitaries was truncated, in reverse chronological order, to match the setting with the fewest events. To find the numbers of rounds won and lost, the data were sampled once, drawing $10^6$ different samples from unique, chronologically ordered (for each setting) detection events. The exact numbers of won and lost rounds in this sampling were 994,512 won and 5,488 lost.
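As a consistency check on the numbers quoted above, the following sketch recomputes the observed winning frequency from the won and lost rounds and shows how a tail bound turns such counts into a P-value. It is an illustration only: the Hoeffding inequality is used here as a simple stand-in and is not claimed to be the statistical method of this work, and `P_BOUND_PLACEHOLDER` is a hypothetical value, not the bound derived for definite-time-direction strategies.

```python
import math

WON = 994_512
LOST = 5_488
ROUNDS = WON + LOST

win_frequency = WON / ROUNDS

# Placeholder bound on the winning probability (assumption, for illustration).
P_BOUND_PLACEHOLDER = 0.95


def log10_hoeffding_p_value(wins, rounds, p_bound):
    """log10 of the Hoeffding bound on P(successes >= wins).

    Holds for independent rounds whose winning probability is at most
    p_bound. Returned as a base-10 logarithm because the bound itself
    underflows double precision for large numbers of rounds.
    """
    excess = wins / rounds - p_bound
    if excess <= 0:
        return 0.0  # the bound is trivial (P-value <= 1)
    return -2.0 * rounds * excess ** 2 / math.log(10)


log10_p = log10_hoeffding_p_value(WON, ROUNDS, P_BOUND_PLACEHOLDER)
print(ROUNDS, round(win_frequency, 4), log10_p)
```

With $10^6$ rounds, any bound that lies a finite margin below the observed frequency yields an extremely small P-value, which is why working with its logarithm is convenient.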
A detection efficiency imbalance is also present in the two output ports of the interferometer, corresponding to the two different measurement outcomes of the control qubit. This efficiency difference could quite easily be characterised and corrected for; however, such actions are equivalent to classical post-processing and are captured by the indefinite tester. Imbalanced detection efficiency could therefore not lead to a violation of the bound, and need not be corrected for since the data already violate the bound. This is a different way of stating the semi-device independence of our methods.
The measurement of the fidelity between the unitary implemented in one direction and the transpose of the unitary in the other direction was performed with coherent light. To estimate the fidelity, the two unitaries were first fitted to the data using a maximum likelihood estimation, and then the fidelity was calculated by evaluating the following average:
taken over 1000 Haar-random states $|\Psi\rangle$. This was done in every step of a Monte-Carlo simulation to estimate the measurement uncertainties induced by the waveplate errors.
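The averaging step described above can be sketched as follows. This is a hedged illustration rather than the analysis code of this work: it assumes the averaged quantity is the state fidelity $|\langle\Psi|A^\dagger B|\Psi\rangle|^2$ between the outputs of two unitaries $A$ and $B$, and the function names are hypothetical. To compare a forwards unitary with the transpose of a backwards one, one would call `average_state_fidelity(U_fw, U_bw.T)` with fitted matrices.

```python
import numpy as np


def haar_random_states(n_states, dim, rng):
    """Haar-random pure states as normalised complex Gaussian vectors."""
    z = rng.normal(size=(n_states, dim)) + 1j * rng.normal(size=(n_states, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def average_state_fidelity(A, B, n_states=1000, seed=0):
    """Mean of |<psi| A^dagger B |psi>|^2 over Haar-random states (assumed form)."""
    rng = np.random.default_rng(seed)
    psis = haar_random_states(n_states, A.shape[0], rng)
    W = A.conj().T @ B
    overlaps = np.einsum("ni,ij,nj->n", psis.conj(), W, psis)
    return float(np.mean(np.abs(overlaps) ** 2))


I2 = np.eye(2, dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

f_identical = average_state_fidelity(I2, I2)   # identical unitaries
f_different = average_state_fidelity(I2, Z)    # maximally different axis flip
print(f_identical, f_different)
```

For a qubit, the exact Haar average of this quantity is $(2 + |\mathrm{tr}(A^\dagger B)|^2)/6$, so the finite-sample estimate for the pair $(\mathbb{1}, Z)$ should lie close to $1/3$.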

Semi-device independence of demonstration
In this section we will elaborate on what is meant by our certification methods being semi-device independent. Our usage of this term is consistent with the notion of semi-device independence introduced in [32]. That our demonstration is semi-device independent means that the measurement that the player performs does not have to be characterised. Equivalently, the player does not have to trust that their measurement device implements a specific measurement. It is a statement about the required assumptions on the measurement.
The basis for the claim that our demonstration is semi-device independent lies in the fact that the derivation of the bounds for the strategies depicted in Fig. 2(a)-(c) in the main text included an optimization over all possible binary measurements the player could perform. This means that there is no measurement that a player using these strategies could perform that would allow them to violate the bounds we derived. Hence, a violation of these bounds has the same interpretation regardless of what measurements the player performed.
It is worth noting that semi-device independence does not imply that the ability of the player to violate the bounds is independent of the measurement they perform. Indeed, measurement imperfections can reduce the winning rate of the player. This can cause them to fail to certify that they employ a certain strategy, even if they do in fact employ that strategy.
A concrete consequence of the semi-device independence is that imperfections in the measurement do not need to be accounted for, and the measurement itself does not need to be modelled in the data analysis. This is in contrast to device-dependent methods, which rely on well characterised measurements to draw conclusions about the observed results. A device-dependent verification method that frequently appears in experimental quantum information science is the witness operator, for example entanglement witnesses or causal witnesses. Such witness operators can also be constructed for the task described in the main text. A witness operator $\hat{S}$ can be used to certify a certain statement about a quantum system or process by experimentally evaluating its expectation value, and confirming that it satisfies some bound:
Empirically evaluating $\langle \hat{S} \rangle$ requires the witness operator to be decomposed in terms of experimentally measurable observables, and the expectation values of these observables to be estimated. Imperfections in the measurement devices induce uncertainties in these estimates, which in turn propagate as uncertainties into the expectation value of the witness operator. A statistically significant violation of the inequality (64) therefore requires well characterised measurement devices.
FIG. 2. Classes of game strategies. The figure depicts the different strategies for the game described in the main text and their corresponding maximum winning probabilities $p$. These maximum winning probabilities are obtained through an optimization over all possible choices of the resources shown in dark blue, and hold for pairs of unitaries $(U, V)$ uniformly randomly picked from the sets $M_+$ and $M_-$. The state $\rho$, for example, is allowed to contain any number of auxiliary degrees of freedom, and analogous statements hold for the measurement $M$, channel $\mathcal{E}$ and process $W$. The strategies differ in how they are able to access the gates picked by the referee. The strategies (a)-(c) are shown here in the forwards time direction, but are also valid in the backwards time direction in which both gates are transposed. Each subsequent strategy is strictly better than the previous one, and only players who have access to a quantum time flip process can win the game with unity probability. (a) Parallel gate order. (b) Causally ordered gate sequence. (c) Process without a definite causal order. (d) Quantum time flip.

FIG. 3. Experimental apparatus. (a) A type-II spontaneous parametric down-conversion source generates frequency-degenerate single-photon pairs at 1546 nm in a ppKTP crystal (top). The signal photon is sent to a heralding detector, while the idler photon is routed to a balanced bulk beam-splitter and coupled into single-mode fiber. Blue (green) arrows indicate the photon path corresponding to the control-qubit state $|0\rangle_C$ ($|1\rangle_C$). A piezo-electric actuator attached to one of the fiber couplers allows for control over the interferometric phase, and pairs of HWPs/QWPs are used in combination with fiber polarization controllers for polarization compensation through the fibers. This use of redundant polarization elements both improves and simplifies the compensation procedure [35]. After the initial beam-splitter, two fiber circulators guide the photon through the V-gadget. Propagating through the gadget in the 'forwards' direction implements the unitary $V$, while propagating 'backwards' has the effect of applying the transposed operation relative to the 'forwards' direction. One of the two paths through the first gadget therefore results in $V^T$ being applied instead of $V$. Two additional circulators then route the photon through a gadget implementing $U$ ($U^T$) in the 'forwards' ('backwards') direction. Finally, the idler photon is sent to a fiber beam-splitter, which applies a Hadamard gate on the path degree of freedom and correlates the two spatial output modes with the sets $M_+$, $M_-$. Detection is performed by superconducting nanowire single-photon detectors (SNSPDs) housed in a 1 K cryostat. Additional QWP/HWP pairs are used to compensate fiber-induced polarization rotations. (b) The fiber circulators route the light from port 1 → 2 and from 2 → 3, and block light entering in port 3. The bidirectional boxes in Fig. 1 are realised using sets of three waveplates. Depending on the propagation direction, they implement either the unitary operation $U$/$V$ or $U^T$/$V^T$.

Box 1. Sets of unitary operators.

FIG. 5. Observed relative outcome frequencies. The figure shows the observed relative frequency of answers $f_{\mathrm{rel}\pm}$ in the quantum flip game for all the pairs of unitaries in the sets $M_+$ and $M_-$. For the gates in the set $M_+$ ($M_-$) the game is won when the player outputs the answer '+' ('−'). The observed average winning frequency is 0.9945. Since the bars correspond to the actual number of times the different outcomes were recorded, there is no associated uncertainty (see Methods).
and min $\mathrm{tr}(C)/d_I$