Higher-order Process Matrix Tomography of a passively-stable Quantum SWITCH

The field of indefinite causal order (ICO) has seen a recent surge in interest. Much of this research has focused on the quantum SWITCH, wherein multiple parties act in a superposition of different orders in a manner transcending the quantum circuit model. This results in a new resource for quantum protocols, and is exciting for its relation to issues in foundational physics. The quantum SWITCH is also an example of a higher-order quantum operation, in that it not only transforms quantum states, but also other quantum operations. To date, no higher-order quantum operation has been completely experimentally characterized. Indeed, past work on the quantum SWITCH has confirmed its ICO by measuring causal witnesses or demonstrating resource advantages, but the complete process matrix has only been described theoretically. Here, we perform higher-order quantum process tomography. However, doing so requires exponentially many measurements with a scaling worse than standard process tomography. We overcome this challenge by creating a new passively-stable fiber-based quantum SWITCH using active optical elements to deterministically generate and manipulate time-bin encoded qubits. Moreover, our new architecture for the quantum SWITCH can be readily scaled to multiple parties. By reconstructing the process matrix, we estimate its fidelity and tailor different causal witnesses directly for our experiment. To achieve this, we measure a set of tomographically complete settings, that also spans the input operation space. Our tomography protocol allows for the characterization and debugging of higher-order quantum operations with and without an ICO, while our experimental time-bin techniques could enable the creation of a new realm of higher-order quantum operations with an ICO.

(Dated: August 7, 2023)
In spite of this large body of work, there has not yet been a complete experimental characterization of a process with an ICO. Instead, previous work on ICO has mainly focused on designing and measuring witnesses that essentially provide a yes-or-no answer to the question "does this process have an indefinite causal order?" On the one hand, this is because no concrete protocol for a complete characterization has yet been presented. On the other hand, the number of experimental settings required for a complete characterization has been prohibitive in past experiments. Here, we overcome both of these hurdles, first presenting a protocol to perform "higher-order process matrix tomography," and then implementing a new experimental method to realize the quantum SWITCH with time-bin qubits, based on the proposal of [47]. Our new passively-stable implementation is based on active optical elements, and it allows us to acquire sufficient data (estimating almost 10,000 distinct probabilities) to fully reconstruct, for the first time, a process matrix demonstrating an ICO.
In the two-party quantum SWITCH, Alice and Bob each act on a target system (typically taken to be a qubit) in their local laboratories. This target qubit is sent first to one party and then to the other. The order in which the target qubit is shared between the two is coherently controlled via a second control qubit. If the control qubit is prepared in a superposition state, then the two parties act on the target qubit in a superposition of orders (Fig. 1a-c). The quantum SWITCH and the so-called quantum time flip [48-50] are the only processes which do not respect our standard notions of causality that have been experimentally implemented to date. The quantum SWITCH is an example of a higher-order quantum operation (HOQO), in the sense that its inputs are not only the control and target qubits, but also Alice and Bob's operations.
All experimental realizations of the quantum SWITCH have been accomplished by encoding both the control and the target systems in a single photon. Typically, the control system is encoded in a path degree of freedom, which then determines the order in which the photon is routed between the two parties. In practice, this means that a photon is placed in a superposition of two paths using a beamsplitter, and these paths are then looped between the two parties in a manner mimicking the paths of Fig. 1c. The parties then act on a different degree of freedom, such as polarization [35], time bins [37], or orbital angular momentum [44]. The result of all of these approaches is essentially a Mach-Zehnder interferometer, which must be phase stabilized.¹ Stabilizing the phase for long enough to acquire the data required for full higher-order quantum process tomography presents a daunting experimental challenge.
We overcome this challenge by implementing a new passively-stable quantum SWITCH. In our experiment, the control system is encoded in a time-bin qubit, the target qubit is encoded in the polarization of the same photon, and active optical switches are used to route the photon between the two parties in a superposition of both orders, as in the theoretical proposal of [47]. While two other experiments have achieved an intrinsically stable phase using a Sagnac-like approach [37,51], it is not clear how to scale these methods to multiple parties. Our approach, however, has a straightforward generalization to multiple parties [47], making it a promising new experimental method to create an ICO.
Recently, there has been some discussion in the community about whether such photonic implementations of the quantum SWITCH are simulations of an ICO [52,53], with some concluding that they may only have an ICO in a "weak sense" [54], while others conclude that the experiments do have an ICO [55,56], or that they at least have a quantifiable resource advantage [57]. Here we do not address this debate, but we make use of the mathematical formalism of process matrices and HOQO to describe our physical experiment.
One method to certify ICO is the violation of a so-called causal inequality [6]. This is a device-independent technique, similar to the use of a Bell violation to verify entanglement [58]. Unfortunately, it is not yet known if one can implement a quantum process that deterministically violates a causal inequality; moreover, it has been shown that the quantum SWITCH cannot violate causal inequalities [17]. Instead, in the first implementation of the quantum SWITCH [35], the ICO was demonstrated indirectly by playing a game in which a player has to decide whether two unitary gates commute or anti-commute (see App. B 1). By winning that game more often than one could with a definite causal order, it was concluded that the experiment did not have a definite causal order. This method can, in fact, be reframed in terms of a causal witness [13]. A causal witness is a measurement which can be used to verify if a process is causally non-separable (i.e. if it has an ICO), and it has been experimentally implemented for the quantum SWITCH [44,45]. Unlike a causal inequality, a causal witness is not device independent, requiring the assumption that the experimenter knows the correct quantum description of the experiment. Recently, progress has been made by relaxing the fully device-independent approach, allowing the certification of causal non-separability under semi-device-independent assumptions [46,59,60], Bell-locality-like assumptions [15,45], and additional device-independent no-signalling assumptions [61,62].

¹ Polarization has also been used [36,44] as a control system, but in this case, too, the polarization is used to route the photon into two paths in superposition, which also requires a stable phase.

Figure 1. [...] where Alice (Bob) acts before Bob (Alice) on the target system. c) A superposition of orders with the control qubit in the state [...]. d) The principle of the process tomography on the quantum SWITCH. Alice and Bob perform projective measurements in three different bases and then prepare four different linearly independent states in their output. The same input states are prepared at the past-target. The past-control is fixed to the superposition state for generating the indefinite order. Finally, the future-control is measured in different bases, while the future-target is traced out (the two measurements shown are those that are implemented experimentally; to span the space, a third measurement is needed).
In this work, we implement full experimental higher-order process tomography of the quantum SWITCH. For this goal, we generalise the ideas from quantum state and quantum process tomography [2,4,63-69] to tackle tomography of arbitrary higher-order processes, including those without a definite causal order. In particular, we show that it is possible to construct tomographically complete measurement settings on arbitrary quantum processes by using a tomographically complete set of input states, spanning the input state space; a tomographically complete set of measurement-repreparation channels, spanning the input operation spaces; and a tomographically complete set of quantum measurements spanning the output state space (see Fig. 1d). We then employ these ideas to experimentally perform higher-order quantum process tomography on the quantum SWITCH.
The rest of the paper is organized as follows. In Section II, we introduce the theory of quantum process matrices, using the quantum SWITCH as a paradigmatic example, and we present our causal tomography protocol. In Section III, we discuss our new passively-stable architecture for the quantum SWITCH. Section IV presents our experimental results, and we conclude in Section V.

A. Process matrices and the quantum switch
The expression quantum process is a general term used to refer to the dynamics of quantum systems, and its precise meaning may depend on the context. For instance, when analysing transformations between quantum states, quantum process refers to a quantum channel, which may be unitary (associated with closed quantum systems and mathematically described by unitary operators) or non-unitary (associated with open quantum systems and mathematically described by completely positive trace-preserving (CPTP) linear maps). In this scenario of transformations between states, one can experimentally determine the dynamics by means of what is known as quantum process tomography [64-66]. To do so, a complete set of known quantum states is fed into an unknown quantum process $\mathcal{E}$, and a complete set of measurements is performed on the output of the underlying process for each input state [69]. When performing standard process tomography, one reconstructs, for example, the chi matrix $\chi$, which takes quantum states as inputs and returns quantum states as outputs [70]. The chi matrix is often called a process matrix; however, we stress that this chi or process matrix is different from the process matrices discussed in the field of HOQO and ICO, the case which we address here.
In this work, we analyse transformations between quantum channels; hence we use the word process to describe the dynamics between quantum operations. An operation that transforms a quantum operation is sometimes referred to as a higher-order quantum operation. Such operations have found applications in several branches of quantum information processing [2-4, 32, 71-73]. The formalism of higher-order transformations is also particularly useful for describing quantum processes which may not respect a definite causal order, such as the notorious quantum SWITCH [8,29], the main object analysed in this work. In its most general form, the quantum SWITCH is a process that transforms a pair of unitary operators $(U_A, U_B)$ into another operator which is a coherent superposition of the compositions $U_A U_B$ and $U_B U_A$. In mathematical terms, the quantum SWITCH is the transformation

$$(U_A, U_B) \mapsto |0\rangle\langle 0| \otimes U_B U_A + |1\rangle\langle 1| \otimes U_A U_B, \tag{1}$$

where the first system of the RHS of Eq. 1 is referred to as the control system, since the order in which $U_A$ and $U_B$ are performed may be controlled by setting the control qubit state. The second system is referred to as the target system, since it is the system on which the unitary operators act.
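As a small numerical illustration of this transformation, the following sketch builds the controlled-order operator for two qubit unitaries and evaluates the control-qubit statistics used in the commuting/anti-commuting game mentioned in the introduction. The assignment of $|0\rangle$ to the order "Alice first" is an assumed convention, and the function names are hypothetical.

```python
import numpy as np

def switch_unitary(U_A, U_B):
    """Controlled-order operator |0><0| (x) U_B U_A + |1><1| (x) U_A U_B.

    Assumed convention: control |0> means Alice acts first (U_B U_A on the target).
    """
    P0 = np.diag([1.0, 0.0]).astype(complex)
    P1 = np.diag([0.0, 1.0]).astype(complex)
    return np.kron(P0, U_B @ U_A) + np.kron(P1, U_A @ U_B)

def control_minus_probability(U_A, U_B, target):
    """Probability of finding the control in |-> when it was prepared in |+>."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
    out = switch_unitary(U_A, U_B) @ np.kron(plus, target)
    # Project the control onto |-> and keep the unnormalised target amplitude.
    projector = np.kron(minus.conj().reshape(1, 2), np.eye(2))
    amp = projector @ out
    return float(np.vdot(amp, amp).real)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([0.6, 0.8j], dtype=complex)  # arbitrary normalised target state

p_anticommuting = control_minus_probability(X, Z, psi)  # X and Z anti-commute
p_commuting = control_minus_probability(Z, Z, psi)      # Z commutes with itself
```

For anti-commuting unitaries the control always ends in $|-\rangle$, and for commuting unitaries it never does, which is why a single control measurement distinguishes the two cases.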

B. Choi-Jamiołkowski isomorphism and quantum operations
Higher-order transformations such as the quantum SWITCH may be conveniently described by means of a process matrix [6], a formalism which is heavily based on the Choi-Jamiołkowski (CJ) isomorphism [74,75], a method to represent linear maps as linear operators, and linear operators as vectors. Let $\mathcal{H}_{in}$ and $\mathcal{H}_{out}$ be finite-dimensional linear (Hilbert) spaces associated with the input and output. Let $U : \mathcal{H}_{in} \to \mathcal{H}_{out}$ be a linear operator; its process vector $|U\rangle\rangle \in \mathcal{H}_{in} \otimes \mathcal{H}_{out}$ is then defined as

$$|U\rangle\rangle := \sum_i |i\rangle \otimes U|i\rangle, \tag{2}$$

where $\{|i\rangle\}_i$ is the computational basis. Notice that the Choi vector of the identity operator is given by

$$|\mathbb{1}\rangle\rangle = \sum_i |i\rangle \otimes |i\rangle,$$

which is equivalent to a maximally entangled state up to normalisation.
Let $L(\mathcal{H})$ be the set of all linear operators acting on $\mathcal{H}$. Let $\widetilde{C} : L(\mathcal{H}_{in}) \to L(\mathcal{H}_{out})$ be a linear map; its Choi operator $C \in L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ is defined as

$$C := \sum_{i,j} |i\rangle\langle j| \otimes \widetilde{C}(|i\rangle\langle j|).$$

Then, the action of any linear map $\widetilde{C}$ on a state $\rho$ can be written in terms of the Choi operator $C$ as

$$\widetilde{C}(\rho) = \mathrm{Tr}_{in}\left[(\rho^T \otimes \mathbb{1}_{out})\, C\right],$$

where $\rho^T$ is the transpose of $\rho$ in the computational basis and $\rho$ is an arbitrary density operator acting on $\mathcal{H}_{in}$. The Choi-Jamiołkowski isomorphism is very useful to represent quantum operations, due to the fact that a linear map $\widetilde{C} : L(\mathcal{H}_{in}) \to L(\mathcal{H}_{out})$ is completely positive if and only if its Choi operator $C \in L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ respects $C \geq 0$, and the map $\widetilde{C}$ is trace-preserving if and only if $\mathrm{Tr}_{out}(C) = \mathbb{1}_{\mathcal{H}_{in}}$. Since quantum channels are completely positive trace-preserving maps, all quantum channels have a simple and direct characterisation in terms of their Choi operators. Before finishing this subsection, we also remark that if $\widetilde{C}$ is a unitary channel, that is, $\widetilde{C}(\rho) = U \rho U^\dagger$ for some unitary operator $U$, direct calculation shows that its Choi operator may be written as $C = |U\rangle\rangle\langle\langle U|$, where $|U\rangle\rangle$ is defined in Eq. 2.
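The properties listed above can be checked numerically. The sketch below assumes the convention $|U\rangle\rangle = \sum_i |i\rangle \otimes U|i\rangle$ and the input-then-output tensor ordering; the helper names are hypothetical and the Hadamard gate is used only as an example channel.

```python
import numpy as np

def choi_vector(U):
    """|U>> = sum_i |i> (x) U|i>, an element of H_in (x) H_out."""
    d_in = U.shape[1]
    basis = np.eye(d_in, dtype=complex)
    return sum(np.kron(basis[:, i], U[:, i]) for i in range(d_in))

def choi_operator(channel, d_in):
    """C = sum_ij |i><j| (x) channel(|i><j|), assembled block by block."""
    blocks = []
    for i in range(d_in):
        row = []
        for j in range(d_in):
            E_ij = np.zeros((d_in, d_in), dtype=complex)
            E_ij[i, j] = 1.0
            row.append(channel(E_ij))
        blocks.append(row)
    return np.block(blocks)

def apply_choi(C, rho, d_in, d_out):
    """Channel action written as Tr_in[(rho^T (x) 1_out) C]."""
    M = np.kron(rho.T, np.eye(d_out)) @ C
    return np.einsum('iaib->ab', M.reshape(d_in, d_out, d_in, d_out))

def trace_out(C, d_in, d_out):
    """Partial trace over the output space; equals 1_in for a trace-preserving map."""
    return np.einsum('iaja->ij', C.reshape(d_in, d_out, d_in, d_out))

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def unitary_channel(rho):
    return H @ rho @ H.conj().T

C = choi_operator(unitary_channel, 2)
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
rho_out = apply_choi(C, rho, 2, 2)
```

Here `C` coincides with the rank-one operator built from `choi_vector(H)`, its eigenvalues are non-negative, and `trace_out(C, 2, 2)` returns the identity, matching the complete-positivity and trace-preservation conditions.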
A quantum instrument is a quantum operation which has a classical and a quantum output, and it formalises the concept of a quantum measurement which has a post-measurement quantum state. Mathematically, a set of linear maps $\{\widetilde{C}_i\}_i$, with $\widetilde{C}_i : L(\mathcal{H}_{in}) \to L(\mathcal{H}_{out})$, is a quantum instrument if all the maps $\widetilde{C}_i$ are completely positive and $\widetilde{C} := \sum_i \widetilde{C}_i$ is trace preserving. In the Choi operator picture, this is equivalent to having $C_i \geq 0$ and $\mathrm{Tr}_{out}(\sum_i C_i) = \mathbb{1}_{\mathcal{H}_{in}}$. A simple and useful class of quantum instruments is the class of measure-and-reprepare instruments. In its most basic form, a measure-and-reprepare instrument simply performs a measurement described by the operators $\{M_i\}$ and reprepares some fixed state $\sigma$. Its linear map is described by $\widetilde{R}_i(\rho) := \mathrm{Tr}(\rho M_i)\,\sigma$, and its Choi operators are given by

$$R_i = M_i^T \otimes \sigma.$$

C. The quantum switch as a process matrix

We are now in a position to present process matrices which describe transformations between the quantum channels of different parties. We start by presenting the process vector describing Fig. 1a, which is simply a process where a system flows freely from a common past target space $\mathcal{H}_{P_t}$ to Alice's input space $\mathcal{H}_{A_{in}}$. Alice may perform an arbitrary operation as the state goes from $\mathcal{H}_{A_{in}}$ to $\mathcal{H}_{A_{out}}$. Later, the state goes freely from Alice's output space $\mathcal{H}_{A_{out}}$ to Bob's input space $\mathcal{H}_{B_{in}}$. Bob may then perform an arbitrary operation as the state goes from $\mathcal{H}_{B_{in}}$ to $\mathcal{H}_{B_{out}}$. Finally, the state goes freely from Bob's output space $\mathcal{H}_{B_{out}}$ to a common future target space $\mathcal{H}_{F_t}$. The process vector of this quantum process is

$$|A \to B\rangle\rangle := |\mathbb{1}\rangle\rangle^{P_t A_{in}} \otimes |\mathbb{1}\rangle\rangle^{A_{out} B_{in}} \otimes |\mathbb{1}\rangle\rangle^{B_{out} F_t},$$

and its process matrix is given by

$$W_{A \to B} := |A \to B\rangle\rangle\langle\langle A \to B|,$$

where $A \to B$ indicates that Alice acts before Bob. (Note that we have not included Alice's or Bob's operations in this description; these will be introduced in Sec. II E.)
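Analogously, we may define the process where Bob acts before Alice, which leads to the process vector

$$|B \to A\rangle\rangle := |\mathbb{1}\rangle\rangle^{P_t B_{in}} \otimes |\mathbb{1}\rangle\rangle^{B_{out} A_{in}} \otimes |\mathbb{1}\rangle\rangle^{A_{out} F_t},$$

and its process matrix is given by

$$W_{B \to A} := |B \to A\rangle\rangle\langle\langle B \to A|.$$

The quantum SWITCH is a process which allows one to coherently alternate between $|A \to B\rangle\rangle$ and $|B \to A\rangle\rangle$. For that, we allow the common past and common future to have another system, denoted as a control system, which is able to coherently alternate between the ordered processes. More formally, the common past and common future spaces are now described by $\mathcal{H}_P = \mathcal{H}_{P_c} \otimes \mathcal{H}_{P_t}$ and $\mathcal{H}_F = \mathcal{H}_{F_c} \otimes \mathcal{H}_{F_t}$, respectively, and the Choi vector of the quantum SWITCH is given by

$$|w_{\text{switch}}\rangle\rangle := |0\rangle^{P_c} \otimes |A \to B\rangle\rangle \otimes |0\rangle^{F_c} + |1\rangle^{P_c} \otimes |B \to A\rangle\rangle \otimes |1\rangle^{F_c}, \tag{11}$$

which corresponds to the process matrix

$$W_{\text{switch}} := |w_{\text{switch}}\rangle\rangle\langle\langle w_{\text{switch}}|.$$

Almost all known applications of the quantum SWITCH, e.g., computational advantages [17], channel discrimination [8], reducing communication complexity [21], and semi-device-independent [59] and device-independent certification of indefinite causality [61,76], do not require the general form of the quantum SWITCH presented in Eq. (11). Rather, in such applications, one starts with the control qubit in the state $|+\rangle := \frac{|0\rangle + |1\rangle}{\sqrt{2}}$, so that the process corresponds to a coherent superposition of processes described by

$$|w_{+}\rangle\rangle := \frac{1}{\sqrt{2}} \left( |A \to B\rangle\rangle \otimes |0\rangle^{F_c} + |B \to A\rangle\rangle \otimes |1\rangle^{F_c} \right).$$

Additionally, for all such applications, one does not make use of the future target system; hence this qubit is often discarded. Mathematically, discarding a system corresponds to the partial trace. Hence, we construct the simplified version of the SWITCH as

$$W_s := \mathrm{Tr}_{F_t}\left( |w_{+}\rangle\rangle\langle\langle w_{+}| \right). \tag{13}$$

In this work, we focus on the simplified quantum SWITCH, and as is usual in the literature, we use "quantum SWITCH" to refer to the process described in Eq. (13).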
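To make the construction concrete, the following sketch assembles the simplified SWITCH for qubits as a 64 × 64 matrix. It assumes that each link between two spaces is the unnormalised vector $\sum_i |i\rangle|i\rangle$, that the control input is $|+\rangle$, and that the tensor factors are ordered as $(P_t, A_{in}, A_{out}, B_{in}, B_{out}, F_c)$ after tracing out $F_t$; the function and variable names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)

# Index order of the tensors: (P_t, A_in, A_out, B_in, B_out, F_t).
# Each identity matrix plays the role of one link |1>> between two spaces.
ket_AB = np.einsum('pa,ob,qf->paobqf', I2, I2, I2)  # P_t->A_in, A_out->B_in, B_out->F_t
ket_BA = np.einsum('pb,qa,of->paobqf', I2, I2, I2)  # P_t->B_in, B_out->A_in, A_out->F_t

def simplified_switch():
    """Trace out F_t from |w_+>><<w_+|, leaving (P_t, A_in, A_out, B_in, B_out, F_c)."""
    e0, e1 = I2[0], I2[1]
    # Append the future control index c as the last tensor index.
    w_plus = (np.multiply.outer(ket_AB, e0) + np.multiply.outer(ket_BA, e1)) / np.sqrt(2)
    w_plus = w_plus.reshape(-1)
    W_full = np.outer(w_plus, w_plus.conj()).reshape((2,) * 14)
    # The repeated index f (future target) is summed over: partial trace.
    W_s = np.einsum('paobqfcPAOBQfC->paobqcPAOBQC', W_full)
    return W_s.reshape(64, 64)

W_s = simplified_switch()
trace_W = np.trace(W_s).real
min_eigenvalue = np.linalg.eigvalsh(W_s).min()
```

With this normalisation the trace equals 8, the product of the dimensions of the three spaces that feed into the process ($P_t$, $A_{out}$, $B_{out}$), and the matrix is positive semidefinite with rank 2 because a single qubit was traced out of a pure process vector.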

Figure 2. Probing a quantum process $W$. Pictorial illustration of how to probe a bipartite quantum process $W$ with a common past and a common future. Here $\rho_w$ are quantum states, $\{A_{a|x}\}$ and $\{B_{b|y}\}$ are quantum instruments, and $\{M_{c|z}\}$ are quantum measurements.

D. The general process matrix formalism
The process matrix formalism allows one to assign a matrix which perfectly describes transformations between arbitrary quantum objects, in particular, transformations of quantum channels into quantum channels. The normalisation constraints from quantum channels (or more general quantum objects) and a generalised notion of completely positive inputs lead to constraints on valid process matrices. In a nutshell, when focused on quantum channels, a matrix $W$ is a process matrix if it is positive semidefinite and respects a set of affine constraints arising from the channel normalisation conditions. These affine constraints are described in several references, such as [6,17], and may be viewed as causal constraints (for instance, they prevent local loops or the possibility of obtaining negative probabilities via Born's rule).

E. Measuring a process matrix
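One of the main applications of the process matrix formalism is to provide mathematical methods to analyse the dynamics of a quantum process and to predict the outcomes of measurements performed on a quantum process. Below, we describe the scenario considered in this work:

1. $W \in L(\mathcal{H}_P \otimes \mathcal{H}_{A_{in}} \otimes \mathcal{H}_{A_{out}} \otimes \mathcal{H}_{B_{in}} \otimes \mathcal{H}_{B_{out}} \otimes \mathcal{H}_F)$ is the process matrix which describes a bipartite scenario with a common past (target) and common future (control).
2. $\rho \in L(\mathcal{H}_P)$ is a quantum state on the common past (target) space.
3. $A_a \in L(\mathcal{H}_{A_{in}} \otimes \mathcal{H}_{A_{out}})$ are the Choi operators of Alice's instrument.
4. $B_b \in L(\mathcal{H}_{B_{in}} \otimes \mathcal{H}_{B_{out}})$ are the Choi operators of Bob's instrument.
5. $M_c \in L(\mathcal{H}_F)$ are the measurement operators on the common future (control) space.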
In the scenario described above, if $W$ is the process matrix, one inputs the state $\rho$ into the common past, Alice performs the instrument $\{A_a\}$, Bob performs the instrument $\{B_b\}$, and the measurement $\{M_c\}$ is performed in the future, then the probability that Alice obtains the outcome $a$, Bob obtains the outcome $b$, and the future obtains the outcome $c$ is given by

$$p(abc) = \mathrm{Tr}\left[ W \left( \rho^T \otimes A_a^T \otimes B_b^T \otimes M_c \right) \right].$$

In practice, it is often convenient to have indices to label states, instruments, and measurements. In this work, we will then use $\{\rho_w\}$ to denote a set of states acting in the common past, $\{A_{a|x}\}$ for a set of instruments in Alice's space ($a$ labels the classical outcome of the instrument and $x$ the choice of instrument), $\{B_{b|y}\}$ for a set of instruments in Bob's space ($b$ labels the classical outcome of the instrument and $y$ the choice of instrument), and $\{M_{c|z}\}$ for a set of measurements in the future space ($c$ labels the classical outcome and $z$ the choice of measurement). We can then define the setting operators as

$$S_{abc|xyzw} := \rho_w^T \otimes A_{a|x}^T \otimes B_{b|y}^T \otimes M_{c|z}, \tag{15}$$

which leads us to the so-called "generalised Born's rule":

$$p(abc|xyzw) = \mathrm{Tr}\left( W\, S_{abc|xyzw} \right).$$

F. Process matrix tomography

The goal of quantum tomography is to completely characterise a quantum object by performing known measurements on it. Before discussing process matrix tomography, we revisit the standard case of quantum state tomography, where one aims to characterise an unknown state by analysing the outcomes obtained after performing known measurements on it. If $M_{a|x} \in L(\mathbb{C}^d)$ are known measurement operators, one can make use of the probabilities $p(a|x) = \mathrm{Tr}(\rho\, M_{a|x})$ to uniquely reconstruct the unknown state $\rho$. When the set of operators $\{M_{a|x}\}$ spans the linear space $L(\mathbb{C}^d)$, the operator $\rho$ may be obtained from $p(a|x)$ by standard linear inversion methods.
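Returning to the generalised Born rule of Sec. II E, the sketch below evaluates $\mathrm{Tr}(W S)$ for the simplified SWITCH when Alice and Bob apply unitaries (single-outcome instruments with Choi operators $|U\rangle\rangle\langle\langle U|$). It assumes setting operators of the form $\rho^T \otimes A^T \otimes B^T \otimes M$, a control input $|+\rangle$, and the factor ordering $(P_t, A_{in}, A_{out}, B_{in}, B_{out}, F_c)$; all function names are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)

def simplified_switch():
    """Simplified SWITCH on (P_t, A_in, A_out, B_in, B_out, F_c), with F_t traced out."""
    ket_AB = np.einsum('pa,ob,qf->paobqf', I2, I2, I2)
    ket_BA = np.einsum('pb,qa,of->paobqf', I2, I2, I2)
    w = (np.multiply.outer(ket_AB, I2[0]) + np.multiply.outer(ket_BA, I2[1])) / np.sqrt(2)
    w = w.reshape(-1)
    W_full = np.outer(w, w.conj()).reshape((2,) * 14)
    return np.einsum('paobqfcPAOBQfC->paobqcPAOBQC', W_full).reshape(64, 64)

def choi_unitary(U):
    """Choi operator |U>><<U| with |U>> = sum_i |i> (x) U|i>."""
    vec = sum(np.kron(I2[:, i], U[:, i]) for i in range(2))
    return np.outer(vec, vec.conj())

def setting_operator(rho, A, B, M):
    """S = rho^T (x) A^T (x) B^T (x) M."""
    return reduce(np.kron, [rho.T, A.T, B.T, M])

def probability(W, S):
    """Generalised Born rule p = Tr(W S)."""
    return float(np.trace(W @ S).real)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([0.6, 0.8j], dtype=complex)
rho = np.outer(psi, psi.conj())
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
M_plus, M_minus = np.outer(plus, plus.conj()), np.outer(minus, minus.conj())

W_s = simplified_switch()
S_anti_minus = setting_operator(rho, choi_unitary(X), choi_unitary(Z), M_minus)
S_anti_plus = setting_operator(rho, choi_unitary(X), choi_unitary(Z), M_plus)
S_comm_plus = setting_operator(rho, choi_unitary(Z), choi_unitary(Z), M_plus)

p_anti_minus = probability(W_s, S_anti_minus)
p_anti_plus = probability(W_s, S_anti_plus)
p_comm_plus = probability(W_s, S_comm_plus)
```

For the anti-commuting pair the control is always found in $|-\rangle$, and for the commuting pair always in $|+\rangle$, in agreement with the direct description of the SWITCH as a controlled-order operator.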
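For qubit states, a standard set of tomographically complete measurements is formed by the three Pauli observables $X$, $Y$, and $Z$, which are associated with the measurement operators via their eigenprojectors:

$$X = |+\rangle\langle +| - |-\rangle\langle -|, \qquad Y = |y_+\rangle\langle y_+| - |y_-\rangle\langle y_-|, \qquad Z = |0\rangle\langle 0| - |1\rangle\langle 1|.$$

In particular, we consider the measurement operators from the set

$$\mathcal{S} := \left\{ |0\rangle\langle 0|,\ |1\rangle\langle 1|,\ |+\rangle\langle +|,\ |y_+\rangle\langle y_+| \right\}, \tag{17}$$

where

$$|\pm\rangle := \frac{|0\rangle \pm |1\rangle}{\sqrt{2}}, \qquad |y_\pm\rangle := \frac{|0\rangle \pm i|1\rangle}{\sqrt{2}}.$$

These four operators, which we label $|\psi_k\rangle\langle\psi_k|$ with $k = 1, \dots, 4$, are linearly independent, forming a (non-orthonormal) basis for $L(\mathbb{C}^2)$.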
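As a minimal illustration of linear inversion for a qubit, the sketch below recovers a state from four probabilities. The specific operator set (projectors onto $|0\rangle$, $|1\rangle$, $|+\rangle$, and $|y_+\rangle$) is an assumed, commonly used choice, and the function names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
y_plus = (ket0 + 1j * ket1) / np.sqrt(2)

def projector(ket):
    return np.outer(ket, ket.conj())

# Assumed basis of L(C^2): four linearly independent rank-one projectors.
operators = [projector(k) for k in (ket0, ket1, plus, y_plus)]

def linear_inversion(probabilities, operators):
    """Solve p_k = Tr(rho E_k) for rho, assuming the E_k form a basis of L(C^d)."""
    d = operators[0].shape[0]
    # Tr(rho E) = sum_ij E[j, i] * rho[i, j]: a dot product with E^T flattened row-wise.
    design = np.array([E.T.reshape(-1) for E in operators])
    rho_vec = np.linalg.solve(design, np.asarray(probabilities, dtype=complex))
    return rho_vec.reshape(d, d)

rho_true = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
probs = [np.trace(rho_true @ E).real for E in operators]
rho_rec = linear_inversion(probs, operators)
```

With exact probabilities the reconstruction is exact; with measured frequencies the same inversion can return an operator with negative eigenvalues, which is the motivation for constrained fitting later on.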
We now consider the task of performing tomography of a qubit channel. As discussed in Section II B, every quantum channel $\widetilde{C} : L(\mathcal{H}_{in}) \to L(\mathcal{H}_{out})$ can be represented by its Choi operator $C \in L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$. In this case, tomography can be carried out by preparing a set of states $\{\rho_w\}_w$, $\rho_w \in L(\mathcal{H}_{in})$, and performing a complete set of measurements on each output state. For qubits, the standard measurements are

$$\mathcal{M} := \{ M_{i|j} \}_{i,j}, \tag{20}$$

where $M_{i|j}$ is a POVM element, the label $i$ stands for the outcome, and $j$ for the choice of measurement. Hence, $\mathcal{M}$ is a set of three dichotomic measurements with POVM elements given by

$$M_{1|1} = |+\rangle\langle +|, \quad M_{2|1} = |-\rangle\langle -|, \quad M_{1|2} = |y_+\rangle\langle y_+|, \quad M_{2|2} = |y_-\rangle\langle y_-|, \quad M_{1|3} = |0\rangle\langle 0|, \quad M_{2|3} = |1\rangle\langle 1|.$$

Note that, due to the normalisation of probabilities, some of these measurement operators are redundant. However, in practice, the orthogonal measurements $M_{2|j}$ are often measured to aid in the data normalization. We will include them here, with a view to our experiment. From these input states and measurements, one estimates the probabilities

$$p(a|x,w) = \mathrm{Tr}\left[ \widetilde{C}(\rho_w)\, M_{a|x} \right],$$

which can also be written as

$$p(a|x,w) = \mathrm{Tr}\left( C\, S_{a|xw} \right) \tag{25}$$

in the Choi formalism, where $S_{a|xw} := \rho_w^T \otimes M_{a|x}$ is a setting operator for standard quantum process tomography. Now, one way to perform complete tomography is by ensuring that the setting operators $S_{a|xw}$ span the space $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$. Also, thanks to a property usually referred to as "local tomography" [77], if the set of operators $\{\rho_w^T\}_w$ spans $L(\mathcal{H}_{in})$ and the set $\{M_{a|x}\}_{a,x}$ spans $L(\mathcal{H}_{out})$, then the set of setting operators $\{\rho_w^T \otimes M_{a|x}\}_{w,a,x}$ spans $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$. In other words, full quantum channel tomography is always possible if one measures a set of characterised setting operators $\{S_{a|xw}\}_{a,x,w}$ that span the space $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$.
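The channel-tomography recipe can be sketched in a few lines. The example assumes four input states and six projective POVM elements (the Pauli eigenprojectors), setting operators $\rho_w^T \otimes M_{a|x}$, and exact probabilities; a least-squares solve stands in for linear inversion because the 24 settings are over-complete. The names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
kets = {
    "0": ket0, "1": ket1,
    "+": (ket0 + ket1) / np.sqrt(2), "-": (ket0 - ket1) / np.sqrt(2),
    "y+": (ket0 + 1j * ket1) / np.sqrt(2), "y-": (ket0 - 1j * ket1) / np.sqrt(2),
}

def projector(ket):
    return np.outer(ket, ket.conj())

states = [projector(kets[k]) for k in ("0", "1", "+", "y+")]           # 4 inputs
povm = [projector(kets[k]) for k in ("+", "-", "y+", "y-", "0", "1")]  # 6 elements

# Setting operators S = rho^T (x) M, so that p = Tr(C S).
settings = [np.kron(rho.T, M) for rho in states for M in povm]

def reconstruct_choi(probabilities, settings):
    """Least-squares solution of p_k = Tr(C S_k) for the Choi operator C."""
    design = np.array([S.T.reshape(-1) for S in settings])
    p = np.asarray(probabilities, dtype=complex)
    c_vec, *_ = np.linalg.lstsq(design, p, rcond=None)
    dim = settings[0].shape[0]
    return c_vec.reshape(dim, dim), design

# Example channel: the Hadamard gate, with Choi operator |H>><<H|.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
vec_H = sum(np.kron(np.eye(2)[:, i], H[:, i]) for i in range(2))
C_true = np.outer(vec_H, vec_H.conj())

probs = [np.trace(C_true @ S).real for S in settings]
C_rec, design = reconstruct_choi(probs, settings)
```

The design matrix has 24 rows but rank 16, the dimension of $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ for qubits, so the Choi operator is recovered uniquely.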
In principle, measuring a set of setting operators $\{S_{a|xw}\}_{a,x,w}$ which span $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ is actually "overkill". More specifically, due to the normalisation condition $\mathrm{Tr}_{out}(C) = \mathbb{1}_{in}$ respected by quantum channels, there are linear operators in $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ which cannot be written as linear combinations of quantum channels, e.g., $|0\rangle\langle 0|_{in} \otimes \mathbb{1}_{out}$. One can then consider a set of operators $\{S_{a|xw}\}_{a,x,w}$ which spans only the set of quantum channels, a subspace with dimension strictly smaller than the dimension of $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$. In particular, the linear space $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ has dimension $d_{in}^2 d_{out}^2$, whereas quantum channels form an affine set with $d_{in}^2 (d_{out}^2 - 1)$ free parameters, so that their linear span in $L(\mathcal{H}_{in} \otimes \mathcal{H}_{out})$ has dimension $d_{in}^2 (d_{out}^2 - 1) + 1$. We emphasize, however, that this does not represent a problem; in fact, in practice, using an over-complete measurement set is known to minimise the experimental errors in standard quantum tomography [78].
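The parameter counting can be checked numerically for qubit channels. The sketch below samples random channels from random isometries (an assumed sampling method, chosen only for convenience) and computes the rank of their Choi operators; the function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_channel_choi(d_in, d_out, n_kraus, rng):
    """Choi operator of a random channel obtained from a random isometry."""
    shape = (d_out * n_kraus, d_in)
    G = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    V, _ = np.linalg.qr(G)  # orthonormal columns, so sum_a K_a^dagger K_a = 1
    C = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for a in range(n_kraus):
        K = V[a * d_out:(a + 1) * d_out, :]  # a-th Kraus operator
        vec = sum(np.kron(np.eye(d_in)[:, i], K[:, i]) for i in range(d_in))
        C += np.outer(vec, vec.conj())
    return C

d_in = d_out = 2
chois = [random_channel_choi(d_in, d_out, 4, rng) for _ in range(40)]
vectors = np.array([C.reshape(-1) for C in chois])

# Free parameters of the channel set: rank of the differences to one reference channel.
affine_dim = np.linalg.matrix_rank(vectors[1:] - vectors[0])
# Allowing an overall scale factor adds one more direction.
linear_dim = np.linalg.matrix_rank(vectors)
full_dim = (d_in * d_out) ** 2
```

For qubits this gives 12 free parameters, i.e. $d_{in}^2(d_{out}^2 - 1)$, a span of dimension 13 once rescaling is allowed, and 16 for the full operator space, so a set of settings spanning the full space is indeed over-complete.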
Figure 3. a) In this row, an incident single photon is deterministically placed in a superposition of the "short" and "long" time bins, using an ultra-fast optical switch (UFOS). In order to achieve passive phase stability, the measurements, shown in rows b) and c), are actually implemented using the same device as in row a). However, for clarity, we have mirrored the device horizontally. b) This row indicates our measurements in the Z-basis. Here, the UFOS remains in the "bar state". After traversing the device, the photon remains in a superposition of the two incident time bins (but spread over two paths). In this situation, simply resolving the arrival time of the photon projects into the Z-basis. c) A schematic of our deterministic measurement in the Y-basis. Here the UFOS alternates between the "cross" and bar states so that the short (long) time bin now takes the long (short) path. In this manner, the two time bins interfere on the beamsplitter, so that finding the photon in the upper (lower) path corresponds to projecting the time bin onto $|y_+\rangle$ ($|y_-\rangle$).

Finally, we now consider tomography of process matrices such as the quantum SWITCH illustrated in Fig. 1d. As discussed before, one way to perform tomography is to measure setting operators $S_{abc|xyzw}$ which span the linear space $L(\mathcal{H}_P \otimes \mathcal{H}_{A_{in}} \otimes \mathcal{H}_{A_{out}} \otimes \mathcal{H}_{B_{in}} \otimes \mathcal{H}_{B_{out}} \otimes \mathcal{H}_F)$. Also, thanks to local tomography, we may consider sets of states and measurements which span the local spaces individually. We then consider the set of states given by Eq. 17 and the set of measurements given by Eq. 20. For tomography of a higher-order process matrix, we then consider the set of measure-and-reprepare instruments (to be used as inputs for Alice and Bob's channels) given by all combinations of the two sets above, that is,

$$\mathcal{R} := \{ R_{i|j,k} \}_{i,j,k}, \tag{26}$$

where

$$R_{i|j,k} := M_{i|j}^T \otimes |\psi_k\rangle\langle\psi_k|. \tag{27}$$

The interpretation of Eq. (27) is the following: first, the measurement $j$ with POVM elements $\{M_{i|j}\}_i$ is performed; then, the state $|\psi_k\rangle$ is prepared. Notice that, in our measure-and-reprepare instruments, the prepared state $|\psi_k\rangle$ is independent of the measurement choice $j$ and the obtained outcome $i$. One can then perform full tomography with the setting operators

$$S_{abc|xyzw} := \left(|\psi_w\rangle\langle\psi_w|\right)^T \otimes A_{a|x}^T \otimes B_{b|y}^T \otimes M_{c|z},$$

where $|\psi_w\rangle\langle\psi_w|$ are the 4 different quantum states in $\mathcal{S}$, $A_{a|x}$ and $B_{b|y}$ each run over the $2 \times 3 \times 4 = 24$ instrument elements of the set $\mathcal{R}$ defined in Eq. (26), and $M_{c|z}$ are the $2 \times 3 = 6$ measurement elements of $\mathcal{M}$ (Eq. (20)) for the future control space. In total, we then measure $4 \times 24 \times 24 \times 6 = 13824$ different settings. We remark that in this tomography approach, we do not make use of any constraints on the process matrices. One could also reduce the number of required settings by imposing the assumption that valid process matrices necessarily belong to a particular linear subspace, as discussed in subsection II D.
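The setting count and the completeness argument can be reproduced with a short script. It assumes the four preparations $|0\rangle$, $|1\rangle$, $|+\rangle$, $|y_+\rangle$ and the six Pauli eigenprojectors as measurement elements, and it checks completeness through the ranks of the local sets rather than by building all setting operators explicitly; variable names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

def projector(ket):
    return np.outer(ket, ket.conj())

plus, minus = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)
y_plus, y_minus = (ket0 + 1j * ket1) / np.sqrt(2), (ket0 - 1j * ket1) / np.sqrt(2)

states = [projector(k) for k in (ket0, ket1, plus, y_plus)]                       # 4
povm_elements = [projector(k) for k in (plus, minus, y_plus, y_minus, ket0, ket1)]  # 6

# Measure-and-reprepare elements: measure M, then prepare |psi><psi|.
instrument_elements = [np.kron(M.T, psi) for M in povm_elements for psi in states]  # 24

def span_dimension(ops):
    """Dimension of the linear span of a list of operators."""
    return int(np.linalg.matrix_rank(np.array([op.reshape(-1) for op in ops])))

n_settings = len(states) * len(instrument_elements) ** 2 * len(povm_elements)

# Local tomography: the span of tensor products has the product of the local dimensions.
local_dims = [span_dimension(states), span_dimension(instrument_elements),
              span_dimension(instrument_elements), span_dimension(povm_elements)]
total_span = int(np.prod(local_dims))
```

The local spans have dimensions 4, 16, 16, and 4, whose product 4096 equals $64^2$, the dimension of the operator space on six qubits, so the 13824 settings are tomographically complete and over-complete by a factor of about 3.4.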
In an ideal theoretical scenario, if we obtain the probabilities $p(abc|xyzw) = \mathrm{Tr}(W S_{abc|xyzw})$ for a tomographically complete set of setting operators, standard linear inversion will uniquely identify the process $W$. However, due to finite statistics, we never obtain the exact probability $p(abc|xyzw)$, but an approximation from measured frequencies. Also, due to measurement precision and other possible sources of errors, we cannot expect to obtain an exact reconstruction of the process matrix. Indeed, performing direct linear inversion often results in unphysical quantum states or processes. Instead, we aim to estimate a physical process matrix that agrees best with the experimental data.
In order to estimate our experimental process matrix $W_{exp}$, we perform a fitting routine to find the process matrix that best describes our measured data. We find that minimizing the least absolute residuals works quite well. To do this, we numerically search for a process matrix $W_{exp}$ that minimizes the following expression:

$$\frac{1}{N_{Settings}} \sum_{abcxyzw} \sqrt{ \left( p_{exp}(abc|xyzw) - \mathrm{Tr}\left( W_{exp}\, S_{abc|xyzw} \right) \right)^2 }, \tag{29}$$

where $N_{Settings}$ is the number of setting operators, and the minimisation is further subject to the constraint that $W_{exp}$ is a valid process matrix. This minimisation can be performed by means of semidefinite programming (SDP), and may be implemented with the help of numerical libraries such as MOSEK. Our MATLAB code implementing this is available at [79]. The first term under the root is the experimentally measured probability, while the second term corresponds to what is predicted by quantum theory for the characterised settings $S_{abc|xyzw}$. Since $W_{exp}$ is the only unknown quantity, the minimization of Eq. 29 delivers a process matrix that fits best to our experimental data, making no assumptions about the specific form of the process matrix.
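The cost function itself is simple to evaluate. The sketch below implements only a mean-absolute-residual objective for a candidate matrix and a list of setting operators; it does not perform the constrained SDP minimisation, and it is not the MATLAB implementation referenced in [79]. A single-qubit toy example stands in for the 64-dimensional process matrix, and the function name is hypothetical.

```python
import numpy as np

def least_absolute_residual(W, settings, p_exp):
    """Mean of |p_exp - Tr(W S)| over all settings (objective only, no constraints)."""
    predicted = np.array([np.trace(W @ S).real for S in settings])
    return float(np.mean(np.abs(np.asarray(p_exp, dtype=float) - predicted)))

# Toy example: a qubit operator probed by three projectors.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
settings = [np.outer(k, k.conj()) for k in (ket0, ket1, plus)]

W_true = np.outer(ket0, ket0.conj())
p_exact = [np.trace(W_true @ S).real for S in settings]   # 1, 0, 0.5
p_noisy = [p + 0.01 for p in p_exact]                     # uniform offset as mock noise

cost_exact = least_absolute_residual(W_true, settings, p_exact)
cost_noisy = least_absolute_residual(W_true, settings, p_noisy)
```

In a full reconstruction, this objective would be handed to an SDP solver together with the positivity and affine constraints that define a valid process matrix.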

III. EXPERIMENT

A. Time-Bin Quantum SWITCH
To date, most implementations of the quantum SWITCH have been based on bulk optics. Since photonic quantum SWITCHes are essentially interferometers, inevitable phase drifts limit the measurement time or require active stabilization [35,45]. Furthermore, since adding more parties means that the dimension of the control system must be increased, scaling up previous architectures requires more and more spatial modes to be transmitted through the same optic, making it difficult to create a SWITCH with more than two parties. Here, we present a passively-stable, fiber-based architecture for the quantum SWITCH where the control system is encoded in a time degree of freedom of the photon. Thus, in our architecture, although the dimension of the control system must still be increased at the same rate, this can be done using additional time bins, and only one spatial mode must traverse each optical element. Furthermore, by using the same interferometer to prepare and measure the control system, all phase fluctuations cancel out, making our setup passively stable. This is important for process tomography, as we must perform many measurements, and the experiment must remain stable during this time.

Figure 4. [...] shows the Sagnac SPDC source that generates heralded single photons. In the yellow section, we illustrate the asymmetric MZI to generate and measure the time-bin control qubit. The orange section shows the target qubit preparation stage, which consists of a PBS and two waveplates. The green area hosts the fiber-based quantum SWITCH. The heralded and heralding photons are detected using SNSPDs, shown in the blue area. By triggering off of the detection event of a heralding photon, we use a pulse generator to control the optical switches in the setup. The sub-panels i)-vi) in panel b) show the functionality of the quantum SWITCH. By controlling the state of the optical switches, we route the two time bins in different orders through Alice and Bob's quantum channels. After the SWITCH operation, the target qubit has experienced the action of the quantum channels in a different order depending on the state of the time-bin qubit.
To create the time-bin qubit that we will use to control the order, we start by generating a photon pair, λ = 1550 nm, using spontaneous parametric down conversion (SPDC). One photon of the pair is directly detected to herald the other photon, setting a time reference for the experiment. The second photon is sent to a 50/50 beamsplitter, a fiber directional coupler (FDC), which splits the incoming mode into two fibers of different lengths. We then deterministically recombine these two fiber paths using an ultra-fast fiber optical switch (UFOS), see Fig. 3a (and also the yellow section of Fig. 4a) [80]. To do so, we generate an electronic pulse, triggered off of the timing reference generated by detecting the first photon. This pulse is sent to the UFOS, which changes its state to first route the "photon component" from the short path, followed by the "photon component" in the long path, into the upper output mode of the UFOS-MZI (Fig. 3a ii and iii). The result is that the second photon is left in an equal superposition of two time bins in a single fiber (Fig. 3a iv). Note that because the short time bin is transmitted through the FDC, while the long time bin is reflected, one mode picks up a reflection phase, while the other does not. Hence, in our experiment we prepare the control qubit in the state

$$\frac{1}{\sqrt{2}}\left(|S\rangle_C - i\,|L\rangle_C\right),$$

where we have labelled the modes as the "short" ("long") state |S⟩_C (|L⟩_C), when the photon has taken the short (long) fiber path of the interferometer. The spacing between these two time bins is 150 ns, which is set by the response time of our UFOS.
The UFOSs we use to route the photon are BATi 2x2 Nanona fiber switches. In addition to creating the time-bin qubit, they allow us to route the photon in a controlled way through the quantum SWITCH. Our UFOSs have a response time of 60 ns, with a maximal duty-cycle of 1 MHz, and a cross-channel isolation greater than 20 dB for any polarisation (see [81,82] for more details).
Having created the time-bin control qubit, we need to apply the quantum SWITCH operation to the target system, which we encode in the polarization degree of freedom of the same photon. To route the photon, we use two additional UFOSs and follow the protocol illustrated in Fig. 4b.i-vi. In particular, we send a voltage pulse train consisting of three low levels and two high levels to the UFOSs. During each low level, the fiber switches are in the "cross state" (output modes are swapped with respect to the input), while during a high level the switches are set to the "bar state" (input modes transmitted to output modes). As the time bins approach the quantum SWITCH (Fig. 4b.i), the UFOSs are initially in the cross state, which routes the short time bin |S⟩_C through Bob's quantum channel (Fig. 4b.ii). Then the UFOSs change to the bar state (Fig. 4b.iii), which sends |L⟩_C through Alice's channel, while |S⟩_C travels over a fiber from the RHS-UFOS to the LHS-UFOS. Next, the UFOSs see a low voltage level, and their state is set to cross (Fig. 4b.iv). This sends |S⟩_C through Alice's local laboratory, while |L⟩_C loops back to the LHS switch. In Fig. 4b.v the UFOSs are in the bar state and hence |L⟩_C passes through Bob's channel. At this point |S⟩_C exits the quantum SWITCH. Finally, the fiber switches are set to the cross state (Fig. 4b.vi) so that |L⟩_C leaves the quantum SWITCH. At this point, depending on the control state, the target system has experienced a different order of Alice and Bob's actions, which, as we will describe shortly, act on the polarization state of the photon. Note that all the lengths of the fibers in the quantum SWITCH are set to ensure the correct routing of time bins spaced by 150 ns.
The time-bin quantum SWITCH from Fig. 4b is placed in the full fiber-based setup (Fig. 4a), in which the time bins are prepared and measured. The quantum SWITCH itself is shown in the green section of panel a). The type-II SPDC photon source [83] is shown in the red section (see footnote 6). Here, a PBS reflects the heralding photon to a single photon detector (blue area), while the other photon is transmitted to the Mach-Zehnder-like time-bin generation interferometer explained above (yellow section). Following this, we have a photon encoding a time-bin qubit in the state (|S⟩_C − i|L⟩_C)/√2 in the "upper" output of the interferometer (clockwise direction). The counter-clockwise path of the loop (lower UFOS-MZI output mode) hosts an optical circulator with an empty port to filter out misguided photons, which can arise from the imperfect extinction ratio of our UFOSs. Next, the target system is encoded in the photon's polarization. For this we use a polarizing beam splitter (PBS) followed by a quarter-waveplate and a half-waveplate, shown in the orange section of Fig. 4a. Then we apply the 2-SWITCH operation described above to the target system (green section of Fig. 4a and Fig. 4b), where we implement Alice and Bob's instruments using short free-space sections containing waveplates and polarizers.
After exiting the SWITCH, the photon follows the fiber loop in the clockwise direction and approaches the MZI used for time-bin generation (yellow section), now from the opposite direction in the lower path. At this point, we can decide to measure the control qubit in the computational (Z) or a superposition (Y) basis. These measurements are illustrated in Fig. 3b and c, respectively. For measurements in the computational basis, both time bins are routed by the UFOS along the lower path of the MZI, after which the two time bins split up at the FDC and are then sent to detectors in the blue region. By measuring the arrival time with respect to the herald detection, we can distinguish between the short and long time bins. To measure in a superposition basis, we use the UFOS-MZI to send the time bins through the opposite paths of the interferometer (|S⟩_C takes the long path and |L⟩_C takes the short path) so that they arrive at the FDC at the same time. In this case, interference occurs at the FDC, and the port of the FDC (upper or lower) from which the photon exits determines the outcome of the superposition-basis measurement.

With this in place, we collect the measurement statistics from different measurement settings by detecting coincidence events between the heralding photon and the FDC output or the circulator output. For each experimental configuration, we record ≈ 1600 coincidence counts (≈ 21,000 total single photon counts) over 10 s at the FDC and circulator output. The photon source generates ≈ 1,480,000 single photons (≈ 116,000 coincidence events) in 10 s before the experiment. Thus, our entire quantum SWITCH experiment has an overall insertion loss of ≈ 18 dB. All of our measurements are carried out with superconducting nanowire single photon detectors (SNSPD) from PhotonSpot Inc. The result, for a representative set of measurements, is shown in Fig. 5.

Footnote 6: Our source is in a Sagnac configuration, although for this experiment we only pump the Sagnac loop in one direction so as not to generate polarization entanglement.
Therein, the bars are the theory for an ideal quantum SWITCH with the control qubit in |y−⟩, described by the process matrix W^{y−}, while the points are our experimentally measured data. Already, one can observe good agreement between theory and experiment. Using unitary operations (rather than measure-and-reprepare instruments), we can also play the anticommuting/commuting gate discrimination game, as in Ref. [35]. We find a success probability of 0.974 ± 0.018, indicating a high fidelity of our implementation (the full details of this measurement are presented in Appendix B 1).
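The logic behind the gate discrimination game can be illustrated with a minimal numerical sketch of an ideal, lossless SWITCH acting on unitaries. The function names, the choice of Pauli gates and the ordering of the control basis as (|S⟩, |L⟩) are assumptions of this sketch and are not taken from the experiment; the sketch also makes no claim about which physical FDC port corresponds to which outcome.

```python
import numpy as np

# Example polarization unitaries (illustrative choice, not the experimental gate set).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def switch_output(U_A, U_B, psi):
    """Joint control-target state after an ideal two-party SWITCH.

    Control basis order is (|S>, |L>). Following the routing in the text,
    |S> passes Bob first and then Alice, |L> passes Alice first and then Bob.
    The control is prepared in (|S> - i|L>)/sqrt(2).
    """
    short_branch = U_A @ U_B @ psi   # Bob acts first
    long_branch = U_B @ U_A @ psi    # Alice acts first
    return np.concatenate([short_branch, -1j * long_branch]) / np.sqrt(2)

def control_y_probabilities(U_A, U_B, psi):
    """Probabilities of finding the control in (|S> + i|L>)/sqrt(2) and (|S> - i|L>)/sqrt(2)."""
    out = switch_output(U_A, U_B, psi).reshape(2, 2)  # rows: control, columns: target
    y_plus = np.array([1, 1j]) / np.sqrt(2)
    y_minus = np.array([1, -1j]) / np.sqrt(2)
    p_plus = np.linalg.norm(y_plus.conj() @ out) ** 2
    p_minus = np.linalg.norm(y_minus.conj() @ out) ** 2
    return p_plus, p_minus

psi_H = np.array([1, 0], dtype=complex)          # horizontally polarized target
p_anti = control_y_probabilities(X, Z, psi_H)    # anticommuting pair
p_comm = control_y_probabilities(X, X, psi_H)    # commuting pair
print("anticommuting:", p_anti, "commuting:", p_comm)
```

In this idealized model the two control outcomes separate commuting from anticommuting gate pairs with certainty, which is why a measured success probability close to one indicates a well-functioning SWITCH.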

B. Experimental Process Matrix Tomography
We will now present our experimental reconstruction of the process matrix of our time-bin quantum SWITCH.
As discussed in Sec. II, to probe the underlying process, Alice and Bob must each implement a complete set of instruments. In our experiment, the target system is encoded in the polarization state of the photon, so Alice and Bob must act on this degree of freedom. Rather than the measurement-repreparation instruments defined in Eqs. 26 and 27, we use a slightly modified form R := R_{i|(j,k)}, presented in the Appendix Eqs. A9 and A10. In particular, Alice and Bob each have access to three different measurement bases M_{i|j}, where j ∈ {1, 2, 3} defines the measurement, and i defines the outcome. For each j, M_j := M_{1|j} − M_{2|j} is the observable associated with the POVM {M_{1|j}, M_{2|j}}. The specific operators we implement are defined in Eqs. A5 and A6. For example, j = 1 corresponds to the Z basis. Experimentally, we implement these measurements using a polarizer fixed to transmit horizontally polarized light |H⟩. We set the measurement basis using a quarter waveplate and a half waveplate before the polarizer, set to the angles given in Eq. A6. To implement the second part of the instrument (the repreparation), we must prepare one of four different states. We experimentally accomplish this using another quarter- and half-waveplate to rotate the photon's polarization if it is transmitted through the polarizer. This allows us to prepare one of the four φ_k states listed in Eq. A8. Thus, overall, both parties can implement the 24 different measurement-repreparation operators defined in Eqs. A9 and A10 (6 different measurement operators times 4 different repreparations).
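To illustrate why 6 measurement operators combined with 4 repreparations can probe the full space of single-qubit operations, the following sketch builds 24 measure-and-reprepare operators and checks the dimension of their linear span. The specific bases (Pauli Z, X, Y eigenstates), the four repreparation states and the Choi-like ordering M^T ⊗ φ are assumptions made for illustration; the operators actually used in the experiment are those of Eqs. A5-A10.

```python
import itertools
import numpy as np

def projector(vec):
    """Rank-one projector onto the (normalized) vector vec."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Assumed measurement operators: two outcomes for each of three Pauli bases.
measurement_ops = [
    projector([1, 0]), projector([0, 1]),      # Z basis
    projector([1, 1]), projector([1, -1]),     # X basis
    projector([1, 1j]), projector([1, -1j]),   # Y basis
]

# Assumed repreparation states (hypothetical stand-ins for the phi_k of Eq. A8).
repreparations = [projector(v) for v in ([1, 0], [0, 1], [1, 1], [1, 1j])]

# Operator for "measure M, then reprepare phi": input space first, output second.
instrument_ops = [
    np.kron(M.T, phi)
    for M, phi in itertools.product(measurement_ops, repreparations)
]

num_ops = len(instrument_ops)
# Rank of the stacked, vectorized operators = dimension of their linear span.
rank = np.linalg.matrix_rank(np.array([op.reshape(-1) for op in instrument_ops]))
print(num_ops, rank)
```

Under these assumptions the 24 operators are linearly dependent but span the full 16-dimensional operator space on the input and output qubit, which is the sense in which such a set is (over)complete.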
In addition to Alice and Bob's channels, we must send in a complete set of target states, and perform measurements on the control qubit after the SWITCH. To this end, we first prepare the target qubit in the four different input states ψ_w given in Eq. A4. We set these states using the quarter-half waveplate pair mounted in the target preparation stage, shown in the orange area of Fig. 4a (the exact waveplate angles that we use are listed in Eq. A4).
Finally, at the output of the SWITCH we must measure the state of the control qubit. This procedure is illustrated in Fig. 3 panels b) and c). As discussed in Sec. III A, we use the same beamsplitter to measure and prepare the time-bin qubit, but from opposite directions. As a result, the phase of this measurement basis is fixed to the Y basis. In our notation in the Appendix Eqs. A1 and A2, this corresponds to the measurement outcomes C_{1|2} and C_{2|2}. Experimentally, a C_{1|2} versus C_{2|2} result depends on which port of the FDC the photon exits. As described above, we can additionally measure in the Z basis by fixing the UFOS-MZI to the bar state on the return trip, such that the short and long time bins do not interfere at the FDC, and observing the arrival time of the time bins. If the photon arrives earlier, this is associated with a C_{1|1} detection event, while a later arrival corresponds to C_{2|1}.
In order to be complete on the future control space, we would require an additional measurement of the control qubit in the X basis, i.e. we need the measurements M_{1|2} and M_{2|2} from Eq. 22. In our experiment, this could be achieved using a fast phase modulator to apply the appropriate phase between the short and long path only on the reverse direction. However, we do not implement this here. Instead, we impose an additional constraint on our tomographic reconstruction. We require that Tr(W_exp X_F) = 0, where X = |+⟩⟨+| − |−⟩⟨−|. Given the passive phase stability of our experiment, this is a very good assumption. We verify this assumption by comparing reconstructions with and without this constraint. In particular, we find that the fidelity between the process matrices reconstructed with and without this constraint is 0.999982, i.e. the two reconstructions differ by far less than our experimental error.
Overall, this results in 24 × 24 × 4 × 4 = 9216 setting operators of the form given in Eq. A11 (number of Alice's settings × number of Bob's settings × number of target states × number of control measurements). However, for the control measurements, we have access to both ports of the FDC beamsplitter simultaneously (i.e. there is a detector in each output port of the beamsplitter), giving rise to 4608 different experimental configurations. From these data we can calculate the probabilities for each given setting operator. Experimentally, we measure count rates associated with each setting operator, which we must then normalize to convert into the required probabilities. To do so, we make use of the normalization condition over the outcomes of all three measurements

$$\sum_{abc} p(abc|xyzw) = 1. \tag{30}$$

Thus, we define a normalization constant for every value of x, y, z, and w,

$$N_{xyzw} = \sum_{abc} C(abc|xyzw), \tag{31}$$

where C(abc|xyzw) are the number of coincidences measured between the heralding detector and the detectors after the FDC, corresponding to the setting operator defined by a, b, c, x, y, z, and w. Then our experimentally estimated probabilities are defined as

$$p_{\mathrm{exp}}(abc|xyzw) = \frac{C(abc|xyzw)}{N_{xyzw}}.$$

A small subset of the resulting probabilities is plotted in Fig. 5. Then, by minimizing Eq. 29 (with the setting operators S_{abc|xyzw} replaced by the experimental setting operators from Eq. A11) we can reconstruct the process matrix W_exp. Our MATLAB code implementing this minimization is available at [79].
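The normalization step can be sketched in a few lines. The array layout below (outcomes a, b, c first, then settings x, y, z, w, with 12 instrument settings per party, 2 control bases and 4 target states) is an assumption chosen so that the total number of entries reproduces the 9216 setting operators quoted in the text; the counts are synthetic and the function name is hypothetical.

```python
import numpy as np

# Assumed index layout: (a, b, c, x, y, z, w).
# a, b, c: two outcomes each; x, y: 3 bases x 4 repreparations = 12 settings each;
# z: 2 control bases (Z and Y); w: 4 target input states.
SHAPE = (2, 2, 2, 12, 12, 2, 4)

def counts_to_probabilities(counts):
    """Normalize coincidence counts C(abc|xyzw) by N_xyzw = sum over a, b, c."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=(0, 1, 2))   # N_xyzw, shape (12, 12, 2, 4)
    probs = counts / totals               # broadcasts over the outcome axes
    return probs, totals

rng = np.random.default_rng(seed=1)
toy_counts = rng.poisson(lam=200, size=SHAPE)   # synthetic data, not measured values
probs, totals = counts_to_probabilities(toy_counts)

print(probs.size)        # number of setting operators in this layout
print(probs.size // 2)   # configurations when both control ports are read at once
```

Each slice of `probs` at fixed (x, y, z, w) sums to one by construction, mirroring Eq. 30.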
IV. RESULTS

A. Fidelity
The experimentally obtained 64 × 64 process matrix and the ideal matrix are plotted in Fig. 6 as a bar chart, where panel a) shows the real part, and panel b) the imaginary part. The solid bars are the experimentally reconstructed process matrix, while the transparent bars are the theoretical process matrix W^{y−}_s. The x and y axes numerically label the basis elements. The relatively close agreement between the target process matrix and our experimental process matrix is already evident in this figure. To further assess the agreement between our experiment and theory, we estimate the fidelity of the measured process matrix W_exp to the ideal matrix W^{y−}_s. Since every valid process matrix normalized by its trace is a valid quantum state, we use the conventional expression for calculating the fidelity,

$$F(\sigma, \rho) = \mathrm{Tr}\sqrt{\sqrt{\sigma}\,\rho\,\sqrt{\sigma}},$$

with σ and ρ being different density matrices [70]. This results in a fidelity of

$$F(W_{\mathrm{exp}}, W^{y-}_{s}) = 0.920 \pm 0.001,$$

where the error is estimated using a Monte Carlo simulation of the entire reconstruction procedure, accounting for finite measurement statistics and small waveplate errors of 1°. Especially given the high dimension of our process matrix, this fidelity indicates that our experiment is quite close to theory.

Figure 6 (caption). Experimentally reconstructed process matrix: a) the real part, b) the imaginary part. The color gradient along the x-axis does not have a physical meaning; rather, it is color coded in order to better identify the individual elements of this 64 × 64 matrix. Additionally, the ideal process matrix is represented via the layer of semi-transparent bars. Calculating the fidelity between these two process matrices results in F(W_exp, W^{y−}_s) = 0.920 ± 0.001.
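A fidelity of this kind can be evaluated numerically as in the sketch below. It assumes the unsquared form Tr sqrt(sqrt(σ) ρ sqrt(σ)) and trace-normalizes both inputs, as one would for process matrices; some authors square this quantity, and the text does not settle which convention was used. The helper names are hypothetical and the example matrices are single-qubit states rather than the 64 × 64 process matrices.

```python
import numpy as np

def psd_sqrt(m):
    """Matrix square root of a Hermitian positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)          # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(sigma, rho):
    """Tr sqrt( sqrt(sigma) rho sqrt(sigma) ) after trace normalization (unsquared)."""
    sigma = sigma / np.trace(sigma).real
    rho = rho / np.trace(rho).real
    s = psd_sqrt(sigma)
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2     # enforce Hermiticity numerically
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(vals)))

rho_0 = np.diag([1.0, 0.0])                        # |0><0|
rho_1 = np.diag([0.0, 1.0])                        # |1><1|
rho_plus = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])  # |+><+|

print(fidelity(rho_0, rho_0), fidelity(rho_0, rho_plus), fidelity(rho_0, rho_1))
```

Using an eigendecomposition for the square roots avoids a dependency beyond NumPy and keeps the routine stable for rank-deficient matrices such as the ideal SWITCH process.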
To quantify the agreement between our experimental data and W_exp, we compare the residuals of our fit r (defined in Eq. 29) to the average statistical error of our data η_stat. The residuals r can be interpreted as the disagreement between the outcome predicted by W_exp and the measured experimental outcome, averaged over all measurement settings. For our fit r = 0.0089, indicating an excellent match to our experimental data. We estimate our statistical errors as follows. First, we treat the probability to obtain an outcome abc as a binomial variable: either we detect a photon or we do not. Then we estimate the variance of that setting as

$$\sigma^2_{abc|xyzw} = \frac{p_{\mathrm{exp}}(abc|xyzw)\left[1 - p_{\mathrm{exp}}(abc|xyzw)\right]}{N_{xyzw}},$$

where N_{xyzw} is the number of photons detected in all outcomes associated with xyzw (defined in Eq. 31). Finally, we compute the average error per setting as

$$\eta_{\mathrm{stat}} = \frac{1}{N_S} \sum_{abcxyzw} \sigma_{abc|xyzw},$$

where N_S is the number of setting operators. This is simply the standard error of each setting operator averaged over all settings. Evaluating this for our data, we find η_stat = 0.0056. Given that η_stat and r are of the same order, we conclude that our process matrix fits our data well.
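The binomial error estimate described above can be sketched as follows. The array layout (outcome axes first, setting axes last) and the function name are assumptions of this sketch, and the numbers are a toy example rather than the experimental data.

```python
import numpy as np

def average_statistical_error(probs, totals):
    """Mean binomial standard error sqrt(p (1 - p) / N_xyzw) over all setting operators.

    probs  : array with outcome axes first and setting axes last
    totals : array over the setting axes only (N_xyzw); broadcasts against probs
    """
    variance = probs * (1.0 - probs) / totals
    return float(np.mean(np.sqrt(variance)))

# Toy example: 8 equally likely outcomes, 1600 detected photons per setting.
toy_probs = np.full((2, 2, 2, 3, 3, 2, 4), 1.0 / 8.0)
toy_totals = np.full((3, 3, 2, 4), 1600.0)
eta = average_statistical_error(toy_probs, toy_totals)
print(eta)
```

Comparing a value obtained this way with the fit residual indicates whether the remaining misfit is at the level expected from counting statistics alone.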

B. Causal Non-Separability
A bipartite process matrix without a common past is causally non-separable when it cannot be written as a classical mixture of causally ordered processes [6,13]. When considering bipartite processes with a common past, such as the quantum SWITCH considered in this work, there are different non-equivalent definitions of indefinite causality. In Refs. [13,43], a bipartite process matrix W ∈ L(H_P ⊗ H_Ain ⊗ H_Aout ⊗ H_Bin ⊗ H_Bout ⊗ H_F) with common past and common future is said to be causally separable if it can be written as a convex sum of causally ordered process matrices. That is, if we can write

$$W = p\, W^{A>B} + (1-p)\, W^{B>A}, \tag{35}$$

where p ∈ [0, 1], and W^{A>B} and W^{B>A} are causally ordered processes (objects also referred to as quantum combs [4,8], see Appendix C). Alternatively, Ref. [11] proposes the concept of extensible causally-separable processes. This leads to a definition which differs from the one in Eq. (35), but is equivalent to the definition of causal separability presented in Ref. [84], which considers not only convex mixtures of causally ordered processes, but also incoherent (hence, classical) control of causal order. The analysis and the numbers presented in this section and in the main part of this paper were obtained via the definition of [43], which is the one presented in Eq. (35). However, we stress that the results of our work are not qualitatively affected by the different aforementioned definitions, in the sense that, in all cases, the process we obtain after tomography is not causally separable, and it is robust against different kinds of noise. In Appendix C, we present a more detailed discussion of such definitions and how they make small quantitative changes in the numbers presented here.

Table I. Generalized Robustness Witness Analysis. A summary of the different generalized robustness witnesses constructed. The witnesses G_{i,j} are labelled by two subscripts. The first indicates if the witness was designed for the ideal process matrix W^{y−} (subscript "y−") or the experimental process matrix W_exp (subscript "exp"). The second subscript indicates if the restricted measurement set (subscript "res") or the complete measurement set (subscript "all") was used for the witness. The first row shows the value of the witness for W^{y−}, and the second row shows the experimental values, which were evaluated as Tr[G_{i,j} W_exp].

              G_{y-,res}        G_{exp,res}       G_{y-,all}        G_{exp,all}
W^{y-}_s      −0.5834           −0.5525           −0.5834           −0.5512
W_exp         −0.387 ± 0.003    −0.431 ± 0.003    −0.370 ± 0.003    −0.431 ± 0.002

Table II. White Noise Witness Analysis. A summary of the different white noise witnesses constructed. The witnesses G_{i,j} and process matrices are labelled by two subscripts, as described in the caption of Tab. I.
One method to quantify the degree to which our quantum process is causally non-separable is by using a causal witness. A causal witness is a Hermitian operator G that is non-negative on all causally separable processes, i.e. Tr(G W_sep) ≥ 0 for every causally separable process W_sep. Conversely, for every causally non-separable process (such as the quantum SWITCH, W^{y−}_s) one can always find a witness G such that Tr(G W^{y−}_s) < 0. Without additional constraints, the quantity Tr(GW) does not have a physical meaning, and may be artificially inflated by multiplying the witness G by some constant. However, by setting additional normalisation constraints on the witness G, one may identify the quantity Tr(GW) with how much noise the process W can tolerate until it becomes causally separable. More concretely, let

$$\mathbb{1}_W := \frac{\mathbb{1}}{d_{A_{\mathrm{in}}}\, d_{B_{\mathrm{in}}}\, d_F}$$

be the "white noise process", which simply consists of discarding everything and outputting white noise, and let W_sep be an arbitrary causally separable process matrix. Ref. [13] shows that the problem

$$\min_{G}\ \mathrm{Tr}(GW) \quad \text{s.t.}\quad G \text{ is a causal witness},\quad \mathrm{Tr}(G) \le \mathrm{Tr}(\mathbb{1}_W),$$

is equivalent to its dual formulation

$$\min\ r \quad \text{s.t.}\quad \frac{W + r\,\mathbb{1}_W}{1+r} = W_{\mathrm{sep}}.$$

Hence, under the normalisation constraint Tr(G) ≤ Tr(1_W), we ensure the identity Tr(GW) = −r, which allows us to interpret Tr(GW) as how robust W is against white noise.
Alternatively, one may also consider the normalisation Tr(GΩ) ≤ 1, where Ω is an arbitrary process matrix. In this case, the equivalent problem reads as

$$\min\ r \quad \text{s.t.}\quad \frac{W + r\,\Omega}{1+r} = W_{\mathrm{sep}},\quad \Omega \text{ a process matrix}.$$

Here, the value Tr(GW) = −r is typically called the "generalised robustness"; it may be viewed as the amount of noise one needs to add to W to make it causally separable in the worst-case scenario.
In order for the witness G to fit the setting operators implemented in our experiment, we impose an additional structure on the witness G, which is given by

$$G = \sum_{abcxyzw} \alpha_{a,b,c,x,y,z,w}\, S_{abc|xyzw},$$

where α_{a,b,c,x,y,z,w} are arbitrary real numbers and S_{abc|xyzw} are the setting operators of our experiment (see Sec. II E). Additionally, for fixed setting operators, finding the maximal violation of a witness G with the normalization constraints related to white and generalised noise is a semidefinite program (SDP) [13], and can be efficiently solved numerically [85]. Our code doing so is also available at [79].
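One consequence of this structure is worth spelling out: a witness built from the experimental setting operators can be evaluated directly from measured probabilities. The short derivation below assumes the usual rule p(abc|xyzw) = Tr(S_{abc|xyzw} W) for the probabilities predicted by a process matrix W, which is the same expression that appears in the worst-case tomography constraint.

```latex
\begin{align}
\mathrm{Tr}(G W)
  &= \mathrm{Tr}\!\left( \sum_{abcxyzw} \alpha_{a,b,c,x,y,z,w}\, S_{abc|xyzw}\, W \right) \\
  &= \sum_{abcxyzw} \alpha_{a,b,c,x,y,z,w}\, \mathrm{Tr}\!\left( S_{abc|xyzw} W \right) \\
  &= \sum_{abcxyzw} \alpha_{a,b,c,x,y,z,w}\, p(abc|xyzw).
\end{align}
```

The witness value is therefore a fixed linear combination of outcome probabilities, with the real coefficients α chosen by the SDP.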
With these tools, we can construct a variety of witnesses. First, we can construct witnesses using the complete measurement set (Eq. 28) or our restricted measurement set (Eq. A11). We can further design witnesses for two different process matrices, W^{y−}_s or W_exp. This results in four witnesses: G_{y-,all}, G_{y-,res}, G_{exp,all} and G_{exp,res}, where the subscript y− (exp) indicates that the witness was optimized for the ideal (experimental) process matrix, and the subscript all (res) indicates that the witness was computed using the complete (restricted) measurement set. We can then further construct witnesses for either the generalized or the white noise robustness.
The results of the generalized robustness witnesses are summarized in Tab. I. The first row of Tab. I shows the value of the four witnesses evaluated using W^{y−}_s, and the second row shows the experimental values, estimated using W_exp. In this case, we see that

$$\mathrm{Tr}\!\left(W^{y-}_{s}\, G^{\mathrm{GR}}_{y-,\mathrm{all}}\right) = \mathrm{Tr}\!\left(W^{y-}_{s}\, G^{\mathrm{GR}}_{y-,\mathrm{res}}\right).$$

In other words, the generalized robustness evaluated either with the complete or restricted setting operators is equal within experimental error. Evidently, the additional measurement on the future control system does not affect the generalized robustness. More interesting for the generalized robustness witnesses is the performance of the witnesses optimized for our experimental process matrix W_exp. Examining the performance of our experimentally estimated witnesses (the bottom row of Tab. I), we see that the witnesses designed specifically for our experimental process matrix increase the generalized robustness, i.e. they yield more negative witness values. In particular, Tr(W_exp G^GR_{exp,all}) < Tr(W_exp G^GR_{y-,all}) and Tr(W_exp G^GR_{exp,res}) < Tr(W_exp G^GR_{y-,res}). This would not be readily possible without performing process matrix tomography.
In Tab. II we summarize the results of our white noise witness analysis. In this case, we see a significant difference between the witnesses constructed with the restricted and complete measurement sets, with the complete measurement sets revealing a higher white noise robustness in all cases. Furthermore, in the second row, we can see that each step progressively improves the experimental white noise robustness. The first entry, Tr(G^WN_{y-,res} W_exp) = −1.65 ± 0.02, shows the value that one would obtain for our setup without performing process matrix tomography, i.e. the witness was designed for the ideal process and uses only the experimentally implementable measurement settings. In the next column, Tr(G^WN_{exp,res} W_exp) = −1.76 ± 0.01 is improved by tailoring the witness for our experiment, while still using only experimentally implementable measurements. In the next two columns, we improve both of these values further by computing the witness assuming the complete measurement set. The final entry, Tr(G^WN_{exp,all} W_exp) = −2.11 ± 0.02, is significantly larger in magnitude than the first entry, clearly showing the power of full process matrix tomography. Process matrix tomography allows us to compute properties of the experimental process without having direct experimental access to them, and we can precisely tailor our analysis to our experimental conditions. In the Appendix Tables III and IV, we show the same analysis for the alternative definition of causal non-separability. The trends observed therein are the same, although the absolute values of the robustnesses are smaller.

Figure 7 (caption). Worst-Case Process Tomography. Plots of the worst-case generalized robustness (a) and white noise robustness (b) causal witnesses versus the allowed deviation from the experimental data. For these plots, the tomography routine attempted to find a process matrix which maximized the witness (i.e. it searched for the "most causally separable" process matrix), while still agreeing with our experimental data with an average error of ϵ. For all witnesses considered, the process which minimizes ϵ is also the most causally non-separable. For ϵ < 0.0089 no valid process matrix is found.

C. Worst-Case Process Tomography
For the process tomography results presented in Sec. IV A, we found the process matrix that best fit our data by minimizing Eq. 29. As discussed there, this resulted in a process matrix that describes our data very well. However, one could ask "Are there other causally separable process matrices that describe the data almost as well?" To answer this question, we perform a "worst-case" version of our process matrix tomography. To do so, rather than minimizing Eq. 29, we find the process matrix that maximizes the value of the generalized or white noise robustness witness, Tr(W_worst G^GR_{j,all}) or Tr(W_worst G^WN_{j,all}), respectively. We do this using the witnesses designed for the ideal process matrix and the originally reconstructed experimental process matrix, but always with the complete measurement sets. This maximisation is subject to the constraints that W_worst is a physical process matrix and that the predictions of W_worst match the experiment within some error ϵ:

$$\sum_{abcxyzw} \left| \mathrm{Tr}\!\left(S_{abc|xyzw} W\right) - p_{\mathrm{exp}}(abc|xyzw) \right| \le \epsilon. \tag{44}$$

Here ϵ is closely related to the residuals defined in Eq. 29.
In particular, if ϵ < r the maximization will fail, as there is no physical process matrix compatible with this constraint. Thus, we perform worst-case process tomography for the four witnesses discussed above, starting from ϵ = r_exp = 0.0089 and increasing to ϵ = 0.015. The results of this analysis are plotted in Fig. 7a and b. We find that the generalized robustness witnesses are more tolerant of ϵ: our data are consistent with causally separable process matrices only for ϵ ≳ 0.012, while for the white noise witnesses this requires ϵ ≳ 0.0105. Although this analysis suggests that the causal non-separability is rather fragile, we stress the worst-case nature of this treatment: if a single causally separable process matrix is compatible with our data within ϵ it will be returned, even if a causally non-separable process matrix fits our data better. In any case, we see that for a range of experimentally relevant errors our data are only compatible with a causally non-separable process matrix.
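Collecting the ingredients described above, the worst-case tomography can be written as a single optimization problem. The form below is a sketch: G stands for any one of the four witnesses, and the data constraint is written as a sum of absolute deviations, which is one reading of Eq. 44 and should be treated as an assumption.

```latex
\begin{align}
W_{\mathrm{worst}} = \arg\max_{W}\quad & \mathrm{Tr}\!\left(W\, G\right) \\
\text{subject to}\quad & W \text{ is a valid process matrix}, \\
& \sum_{abcxyzw} \left| \mathrm{Tr}\!\left(S_{abc|xyzw} W\right) - p_{\mathrm{exp}}(abc|xyzw) \right| \le \epsilon .
\end{align}
```

The objective is linear in W and both constraints are convex, so for each value of ϵ the problem can be handled with the same kind of numerical solver used for the standard reconstruction. A non-negative optimum signals that some process matrix within ϵ of the data is not detected by the witness G.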

V. DISCUSSION
In this work, we have presented a protocol to perform process tomography on a higher-order quantum operation, the quantum SWITCH. We discussed how to construct a complete set of measurements. The requirements for this go beyond standard quantum process tomography, wherein one must "only" send a complete set of input states through the process, and perform a complete set of measurements after the process. In particular, because HOQOs take quantum channels as inputs, we must also implement a complete set of quantum channels for each input channel. This can be achieved using measure-and-reprepare instruments. Since this procedure scales even worse than standard process tomography, we implement it using a new phase-stable architecture of the quantum SWITCH, allowing for long integration times.
Our photonic quantum SWITCH uses a time-bin qubit as the control system. By recombining time-bin qubits using a passively-stable interferometer, we were able to keep our experiment stable indefinitely. We believe this technique will be beneficial for various time-bin quantum information experiments and may even be scalable to high-dimensional time-bin qudits. This would enable the construction of a multi-party quantum SWITCH. The results of performing quantum process matrix tomography on our experiment show that we have indeed implemented a high-fidelity quantum SWITCH, with a fidelity of F = 0.920 ± 0.001. We then used our results to verify the causal non-separability of our experiment, designing causal witnesses specifically for our experimental process matrix. Finally, we implemented a worst-case process tomography, searching for a causally separable process that could also describe our measurement. To find such a process, we had to allow for a ≈ 1.5 times larger disagreement between our measurements and our causally separable model.
Although our protocol was presented for the quantum SWITCH, it could be adapted to general HOQOs in a straightforward manner. In our present work, we performed an overcomplete set of measurements, but it should be possible to implement a reduced set of measurements by taking into account the constraints on the space of physical process matrices. We also point out that many complexity-reducing techniques from standard state and process tomography, including compressed sensing [86], shadow tomography [87], and adaptive tomography [88], should apply to our protocol equally well. We leave these as topics for future work.
In order to ensure that the quantum SWITCH performs the desired transformations on the photon's polarization, it is important to correct for the birefringent behaviour of the fibers that connect Alice and Bob's laboratories, i.e. to ensure that the fibers do not change the polarization state of the photon. Hence, each fiber link has to perform an identity operation. To this end, each fiber is equipped with a 3-loop fiber polarization controller, which allows us to implement any unitary polarization transformation in the fiber. In order to implement a true identity operation, we must check that the correct transformation is applied in two different bases. We use the computational and diagonal bases. A convenient way to do this is to send classical light at the same wavelength as the single photons through the fibers and detect the polarization with a polarimeter at the fiber output. To ensure the identity transformation, we switch the light's polarization state at the target preparation stage (see Fig. 4) between horizontal |H⟩ and diagonal |D⟩, while adjusting the polarization controller until it converges to the correct setting in both bases. To correct the polarization for the second trip through the channels, we place a polarizer in one of the channels (without additional waveplates). This decouples the compensation from the previous fiber. Then, we follow the same procedure and alternate the polarizer to transmit |H⟩ and then |V⟩. In order for this procedure to work properly, it is essential that the wavelengths of the classical light and the single photons are matched.
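The need to check two different bases can be made concrete with a small sketch. The residual fiber transformation used here, a relative phase between the horizontal and vertical components, is a hypothetical example and not a measured property of the setup; the function name is likewise an assumption.

```python
import numpy as np

H = np.array([1, 0], dtype=complex)                 # horizontal polarization
D = np.array([1, 1], dtype=complex) / np.sqrt(2)    # diagonal polarization

def preserves(U, state):
    """True if U maps the polarization state to itself up to a global phase."""
    return bool(np.isclose(abs(np.vdot(state, U @ state)), 1.0))

# Hypothetical uncompensated fiber: a 90 degree phase between H and V components.
fiber = np.diag([1, 1j])

checks = {name: preserves(fiber, s) for name, s in (("H", H), ("D", D))}
print(checks)
```

A transformation of this kind passes a check with horizontally polarized light alone, yet turns diagonal light into circular light. Only a unitary that preserves both test states acts as the identity on polarization, up to an irrelevant global phase.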

Experimental Instruments and Measurements
Here we explain in detail the measurements and instruments we implement in the lab, and how they relate to the ideal settings discussed in the main text. Experimentally, the measurement outcomes c = 1 and c = 2 correspond to finding the photon exiting different ports of the beamsplitter (labeled FDC in Fig. 4). Note that the order of the output indices has been swapped compared to Eqs. 21-23 in order to be consistent with our experimental convention.

b. Past Target States
Our target system is encoded in the polarization degree of freedom of the photon. We thus prepare its state by sending the photon through a polarizer set to transmit horizontal polarization, which we define to be the logical |0⟩ state. We then set its state using a quarter waveplate, followed by a half waveplate. We can thus prepare the set of states given by

$\mathcal{S} := \{ |\psi_w\rangle\langle\psi_w| \}_{w=1}^{4}$, (A3)

where the waveplate angles and resulting states are tabulated under the columns

QWP   HWP   w   |ψ_w⟩

c. Alice and Bob's Instruments

As our target system is encoded in a polarization state, Alice and Bob must implement measure-and-reprepare channels on this degree of freedom. To perform the measurement, they use a fixed polarizer to project onto horizontal polarization. Using a quarter and a half waveplate before the polarizer, they can then set the measurement basis. Since this only provides one outcome (either the photon is transmitted or not, and they cannot detect the cases in which a photon is absorbed by the polarizer), they must also explicitly set the waveplates to perform the orthogonal measurement in order to normalize the data and compute a probability. These measurements are defined by a set of operators $M_{i|j}$, where the waveplate angles and resulting measurement operators are tabulated under the columns

QWP   HWP   i   j   M_{i|j}

Table III. Generalized Robustness Alternative Definition Witness Analysis. A summary of the different generalized robustness witnesses constructed using the definition of [37], presented in Eq. (C8). The witnesses $G_{i,j}$ are labelled by two subscripts. The first indicates whether the witness was designed for the ideal process matrix $W_{y-}$ (subscript "y−") or the experimental process matrix $W_{\mathrm{exp}}$ (subscript "exp"). The second subscript indicates whether the restricted measurement set (subscript "res") or the complete measurement set (subscript "all") was used for the witness. The first row shows the value of the witness for $W_{y-}$, and the second row shows the experimental values, which were evaluated as $\mathrm{Tr}[G_{i,j} W_{\mathrm{exp}}]$.

Table IV. White Noise Alternative Definition Witness Analysis. A summary of the different white noise witnesses constructed using the definition of [37], presented in Eq. (C8). The witnesses $G_{i,j}$ and process matrices are labelled by two subscripts, as described in the caption of Tab. III.
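As a minimal numerical sketch of the polarization measurement described for Alice and Bob, the following Jones-calculus snippet computes which state a quarter waveplate, a half waveplate and a horizontal polarizer jointly project onto. The Jones-matrix convention, the function names, and the assumption that the photon meets the QWP before the HWP are illustrative choices, not details confirmed by the experiment. The helper `outcome_probability` simply mirrors the normalization by the orthogonal setting mentioned in the text.

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix for an angle theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def waveplate(angle_deg, retardance):
    """Jones matrix of a waveplate with its fast axis at angle_deg (assumed convention)."""
    th = np.deg2rad(angle_deg)
    return rot(th) @ np.diag([1, np.exp(1j * retardance)]) @ rot(th).T

def qwp(angle_deg):
    return waveplate(angle_deg, np.pi / 2)

def hwp(angle_deg):
    return waveplate(angle_deg, np.pi)

H = np.array([1, 0], dtype=complex)  # horizontal polarization, logical |0>

def measured_state(qwp_deg, hwp_deg):
    """State |m> that is mapped onto |H> by QWP -> HWP, i.e. the state the
    sequence QWP -> HWP -> H-polarizer projects onto (assumed element order)."""
    U = hwp(hwp_deg) @ qwp(qwp_deg)
    return U.conj().T @ H

def projector(v):
    return np.outer(v, v.conj())

def outcome_probability(counts, counts_orthogonal):
    """Normalize transmitted counts using the orthogonal waveplate setting."""
    return counts / (counts + counts_orthogonal)

print(np.round(projector(measured_state(45, 22.5)), 3))
print(np.round(projector(measured_state(45, 67.5)), 3))
```

Under these assumed conventions, a QWP at 45° with an HWP at 22.5° projects onto |+⟩⟨+|, and moving the HWP to 67.5° gives the orthogonal projector |−⟩⟨−|; the global phase of the computed state drops out of the projector.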
ordered from A to B if it can be constructed by means of a sequential quantum circuit [6,13], also called a quantum comb [2,8] (C6)

In the main part of this work, we follow the definition of Ref. [43], where a bipartite process matrix $W \in \mathcal{L}(H_P \otimes H_{A_{in}} \otimes H_{A_{out}} \otimes H_{B_{in}} \otimes H_{B_{out}} \otimes H_F)$ with common past and common future is said to be causally separable if it can be written as a convex sum of causally ordered process matrices. That is, if we can write

$W = p\,W^{A>B} + (1-p)\,W^{B>A}$, (C7)

where $p \in [0, 1]$, and $W^{A>B}$ and $W^{B>A}$ are causally ordered processes. However, as discussed earlier, there exists a non-equivalent definition, which goes beyond simple convex combinations and allows incoherent classical control of causal orders. This definition is presented in Ref. [84], and it is proven to be equivalent to the notion of extensible causal separability presented in Ref. [11]. In the bipartite scenario with a common past and common future, Ref. [84] states that a process matrix $W \in \mathcal{L}(H_P \otimes H_{A_{in}} \otimes H_{A_{out}} \otimes H_{B_{in}} \otimes H_{B_{out}} \otimes H_F)$ is causally separable when there exist causally ordered processes $W^{A>B}$ and $W^{B>A}$ such that the condition of Eq. (C8) is satisfied. Notice that the definition of Ref. [84], presented in Eq. (C8), is more relaxed than that of Ref. [43], presented in Eq. (C7). Indeed, as we show later, there are processes which are causally separable according to the definition of [84], but causally non-separable according to the definition of [43].
In the definition of Ref. [84], for a causal witness to be valid one must include an extra constraint on the witness. We can then re-calculate the numbers presented in Table I and Table II of the main text, but following the definition of Ref. [84]. These results are presented in Table III and Table IV. Although there are some differences in the obtained numbers, the qualitative result remains the same.
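Both definitions of causal separability build on causally ordered processes, which are characterized by linear constraints of the form $\mathrm{tr}_X(W) = \mathrm{tr}_{XY}(W) \otimes \mathbb{1}_Y / d_Y$, as in Eqs. (C1) and (C2). As an illustration only, the sketch below implements this "trace and replace" operation with NumPy and checks the analogous constraints on a toy process. The toy process is an assumption made for brevity: it has four qubit systems ordered as (A_in, A_out, B_in, B_out), omits the common past and future, and sends Alice's output to Bob through an identity channel. All function names are hypothetical, and the normalization condition is not checked.

```python
import numpy as np

def trace_and_replace(W, dims, sys):
    """Return tr_sys(W) tensored with identity/d, put back at position `sys`."""
    n = len(dims)
    T = W.reshape(list(dims) + list(dims))            # row axes, then column axes
    T = np.trace(T, axis1=sys, axis2=sys + n)         # partial trace over `sys`
    T = np.tensordot(T, np.eye(dims[sys]) / dims[sys], axes=0)
    T = np.moveaxis(T, [-2, -1], [sys, n + sys])      # restore subsystem order
    return T.reshape(W.shape)

def replace_all(W, dims, systems):
    for s in systems:
        W = trace_and_replace(W, dims, s)
    return W

def is_ordered(W, dims, first_out, second_in, second_out):
    """Check the two trace-and-replace constraints for 'first party before second'."""
    c1 = np.allclose(W, replace_all(W, dims, [second_out]))
    c2 = np.allclose(
        replace_all(W, dims, [second_in, second_out]),
        replace_all(W, dims, [first_out, second_in, second_out]),
    )
    return c1 and c2

# Toy process: state |0><0| into Alice, identity channel A_out -> B_in,
# Bob's output discarded. Subsystem order: A_in, A_out, B_in, B_out.
dims = (2, 2, 2, 2)
AI, AO, BI, BO = range(4)
rho = np.array([[1, 0], [0, 0]], dtype=complex)
phi = np.eye(2).reshape(4)                            # unnormalized |00> + |11>
choi_identity = np.outer(phi, phi.conj())
W = np.kron(np.kron(rho, choi_identity), np.eye(2))

print("A before B:", is_ordered(W, dims, AO, BI, BO))
print("B before A:", is_ordered(W, dims, BO, AI, AO))
```

For this toy process the constraints for the order A before B hold, while those for B before A fail, since Bob's input depends on Alice's output. Applying the same checks to the six-system process matrices of this appendix would additionally require the past and future spaces and the third constraint.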

Figure 1. The quantum SWITCH. Panel a) (b)) shows the causally ordered process with the control in state |0⟩_C (|1⟩_C), where Alice (Bob) acts before Bob (Alice) on the target system. c) A superposition of orders with the control qubit in the state …

3. $A_a \in \mathcal{L}(H_{A_{in}} \otimes H_{A_{out}})$ are the Choi operators of an instrument on Alice's space.
4. $B_b \in \mathcal{L}(H_{B_{in}} \otimes H_{B_{out}})$ are the Choi operators of an instrument on Bob's space.

Figure 3. Active generation and measurement of time-bin qubits. a) In this row, an incident single photon is deterministically placed in a superposition of the "short" and "long" time bins, using an ultra-fast optical switch (UFOS). In order to achieve passive phase stability, the measurements, shown in rows b) and c), are actually implemented using the same device as in row a). However, for clarity, we have mirrored the device horizontally. b) This row indicates our measurements in the Z-basis. Here, the UFOS remains in the "bar" state. After traversing the device, the photon remains in a superposition of the two incident time bins (but spread over two paths). In this situation, simply resolving the arrival time of the photon projects onto the Z-basis. c) A schematic of our deterministic measurement in the Y-basis. Here the UFOS alternates between the "cross" and "bar" states so that the short (long) time bin now takes the long (short) path. In this manner, the two time bins interfere on the beamsplitter, so that finding the photon in the upper (lower) path corresponds to projecting the time bin onto |y+⟩ (|y−⟩).
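The Y-basis measurement of panel c) can be illustrated with a two-mode beamsplitter model. The sketch below assumes that, after the UFOS has rerouted the bins, the short and long time bins arrive simultaneously in the upper and lower input modes of a 50:50 beamsplitter. The particular beamsplitter matrix is a convention chosen here so that |y+⟩ exits the upper port, as in the caption; it is not taken from the experiment, and `port_probabilities` is a hypothetical helper name.

```python
import numpy as np

# Assumed 50:50 beamsplitter convention (unitary); rows are the output modes
# (upper, lower) and columns are the input modes (short bin, long bin).
BS = np.array([[1, -1j], [-1j, 1]]) / np.sqrt(2)

# Time-bin states in the (short, long) basis.
y_plus = np.array([1, 1j]) / np.sqrt(2)
y_minus = np.array([1, -1j]) / np.sqrt(2)
short_bin = np.array([1, 0], dtype=complex)

def port_probabilities(timebin_state):
    """Probabilities [upper, lower] of finding the photon in each output port
    once the two time bins have been overlapped on the beamsplitter."""
    amplitudes = BS @ timebin_state
    return np.abs(amplitudes) ** 2

print("|y+>:", np.round(port_probabilities(y_plus), 3))
print("|y->:", np.round(port_probabilities(y_minus), 3))
print("|short>:", np.round(port_probabilities(short_bin), 3))
```

In this model |y+⟩ and |y−⟩ exit through opposite ports with certainty, while a Z-basis state splits evenly, which is why the Z-basis measurement of panel b) instead resolves the photon's arrival time.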

Figure 4. Experimental setup. a) The complete experiment. The individual sections are indicated with colors. The red section shows the Sagnac SPDC source that generates heralded single photons. In the yellow section, we illustrate the asymmetric MZI used to generate and measure the time-bin control qubit. The orange section shows the target qubit preparation stage, which consists of a PBS and two waveplates. The green area hosts the fiber-based quantum SWITCH. The heralded and heralding photons are detected using SNSPDs, shown in the blue area. Triggered by the detection of a heralding photon, a pulse generator controls the optical switches in the setup. The sub-panels i)-vi) in panel b) show the functionality of the quantum SWITCH. By controlling the state of the optical switches, we route the two time bins in different orders through Alice and Bob's quantum channels. After the SWITCH operation, the target qubit has experienced the action of the quantum channels in a different order depending on the state of the time-bin qubit.

Figure 5. Experimentally Estimated Probabilities. A small subset of the experimentally estimated probabilities. The bars are the theory for the ideal process matrix $W_{y-}$, and the points are the experimental estimates. The upper (lower) panel shows measurements of the control qubit in the Z (Y) basis. The red bars are for outcomes |0⟩ and |y−⟩, and the blue for outcomes |1⟩ and |y+⟩.

Figure 6. Process tomography data. This figure shows the experimentally reconstructed process matrix of the quantum SWITCH. a) represents the real part of $W_{\mathrm{SWITCH}}$ and b) the imaginary part. The color gradient along the x-axis does not have a physical meaning; rather, it is color coded to make the individual elements of this 64×64 matrix easier to identify. Additionally, the ideal process matrix is represented by the layer of semi-transparent bars. Calculating the fidelity between these two process matrices results in $F(W_{\mathrm{exp}}, W^{s}_{y-}) = 0.920 \pm 0.001$.
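$\mathrm{Tr}[W_{y-} G^{GR}_{y-,\mathrm{res}}] \approx \mathrm{Tr}[W_{y-} G^{GR}_{y-,\mathrm{all}}]$ and $\mathrm{Tr}[W_{\mathrm{exp}} G^{GR}_{\mathrm{exp,res}}] \approx \mathrm{Tr}[W_{\mathrm{exp}} G^{GR}_{\mathrm{exp,all}}]$.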

Figure 7. Worst-Case Process Tomography. Plots of the worst-case generalized robustness (a) and white noise robustness (b) causal witnesses versus the allowed deviation from the experimental data. For these plots, the tomography routine attempted to find a process matrix which maximized the witness (i.e., it searched for the "most causally separable" process matrix), while still agreeing with our experimental data with an average error of ϵ. For all witnesses considered, the process which minimizes ϵ is also the most causally non-separable. For ϵ < 0.089 no valid process matrix is found.
a. Future Control Measurement

Our control qubit is the time-bin qubit. Its past state is fixed to |y−⟩. Ideally, we would measure the future control in three different bases described by Eqs. 21-23. However, due to experimental limitations we only measure the future control in two bases,

$\mathcal{C} := \{ C_{c|z} \}_{c=1,z=1}^{c=2,z=2}$,

QWP   HWP     i   j   M_{i|j}
45°   22.5°   1   2   |+⟩⟨+|
45°   67.5°   2   2   |…
their measurements, Alice and Bob reprepare the target state in one of 4 different options $\mathcal{P} := \{ |\phi_k\rangle\langle\phi_k| \}_{k=1}^{4}$.

$W_{\mathrm{exp}}$   −0.346 ± 0.003   −0.361 ± 0.003   −0.323 ± 0.003   −0.363 ± 0.003

$W_{\mathrm{exp}}$   −0.572 ± 0.005   −0.614 ± 0.006   −0.689 ± 0.007   −0.739 ± 0.008

More explicitly, a process matrix $W^{A>B} \in \mathcal{L}(H_P \otimes H_{A_{in}} \otimes H_{A_{out}} \otimes H_{B_{in}} \otimes H_{B_{out}} \otimes H_F)$ is causally ordered from A to B if it respects

$\mathrm{tr}_{F}(W^{A>B}) = \mathrm{tr}_{B_O F}(W^{A>B}) \otimes \frac{\mathbb{1}_{B_O}}{d_{B_O}}$ (C1)

$\mathrm{tr}_{B_I B_O F}(W^{A>B}) = \mathrm{tr}_{A_O B_I B_O F}(W^{A>B}) \otimes \frac{\mathbb{1}_{A_O}}{d_{A_O}}$ (C2)

$\mathrm{tr}_{A_I A_O B_I B_O F}(W^{A>B}) = \mathrm{tr}_{P A_I A_O B_I B_O F}(W^{A>B}) \otimes$ …

$W^{B>A} \in \mathcal{L}(H_P \otimes H_{A_{in}} \otimes H_{A_{out}} \otimes H_{B_{in}} \otimes H_{B_{out}} \otimes H_F)$ is causally ordered from B to A if it respects

$\mathrm{tr}_{F}(W^{B>A}) = \mathrm{tr}_{A_O F}(W^{B>A}) \otimes$ …