Coherent control of the causal order of entanglement distillation

Indefinite causal order is an evolving field with potential involvement in quantum technologies. Here we propose and study one possible scenario of practical application in quantum communication: a compound entanglement distillation protocol that features two steps of a basic distillation protocol applied in a coherent superposition of two causal orders. This is achieved by using one faulty entangled pair to control-swap two others before a fourth pair is combined with the two swapped ones consecutively. As a result, the protocol distills the four faulty entangled states into one of a higher fidelity. Our protocol has a higher fidelity of distillation and probability of success for some input faulty pairs than conventional concatenations of the basic protocol that follow a definite distillation order. Our proposal shows the advantage of indefinite causal order in an application setting consistent with the requirements of quantum communication.


I. INTRODUCTION
Causality is a fundamental concept in nature and is deeply embedded in the traditional model of computation. A computing algorithm, classical or quantum, usually envisions a target system undergoing a series of gates in a fixed causal order. Recent studies [1] revealed that, under the assumption that quantum mechanics is valid locally, events can happen in an indefinite causal order. This concept was then extended to a quantum computation model with an indefinite causal structure, which features quantum operations occurring in an indefinite causal order [2,3].
Significant effort has been made to look for specific communication and computational tasks where indefinite causal structures provide an advantage. On the computation side, specific classes of problems such as Fourier [4,5] and Hadamard promise problems [6,7] have been shown to enjoy a reduction in query complexity when quantum gates are queried in an indefinite causal order. The advantage of such a setup in solving the generalized Deutsch's problem has also been explored [8]. On the communication side, some probabilistic communication tasks were found to gain a boost in success probability [9] and a reduction in communication complexity [10] when the orders in which information passes through the communicating parties were put into a superposition. It was also discovered that putting two noisy channels into a superposition of passage orders reduces the communication noise and, for some specific channels, removes the noise completely [11-15]. The above examples of indefinite causal order all arise under the setup in which two events, each described by a completely positive trace-nonincreasing (CPTNI) map, are passed through by the target system in an order coherently controlled by another qubit. Mathematically, the advantage of putting CPTNI maps in an indefinite causal order arises from the nonvanishing commutators of the Kraus operators that make up each quantum map [12,16,17].
A natural question to ask is whether indefinite causal structures are useful in any currently existing or near-term quantum information processing tasks. Studies that reduce communication noise by superposing the passage orders of noisy channels [11-15] have assumed that the message makes a noiseless return to the input of one channel after exiting the other, which is inconsistent with realistic situations of quantum communication. On the computation side, the information-theoretic tasks studied in previous works [4-7,9,10] cannot be easily generalized to more common computational problems, leaving significant research effort still required to bridge the gap. A recent work [8] experimentally demonstrated solving the generalized Deutsch's problem with an indefinite causal order setup, but such a proposal is only likely to be useful in the long term, with fault-tolerant quantum computers that can solve this problem on a large scale with large inputs. Fault-tolerant quantum computers pose an immense experimental challenge, which prevents such proposals from being used in the short term. Some other examples [18-20] considered applying two quantum teleportation steps in an indefinite causal order by coherently swapping the two entangled states involved, in order to reduce the noise on the teleported state caused by imperfections of the entangled states upon their generation. However, as we show in the Appendix, their proposal does not achieve noise reduction. We also argue that in the more general case where noise occurs during the distribution of entanglement, swapping of the entangled states must be carried out remotely. This requires extra entanglement, which brings difficulties to their proposal. It therefore remains largely unknown how useful indefinite causal structures are in near-term computation and communication tasks.
In this paper, we study the entanglement distillation of four entangled pairs and show that distillation of entanglement, which is a basic and necessary task in quantum communication, benefits from an indefinite causal structure. More specifically, it allows the production of higher-fidelity entangled pairs than merely carrying out distillation steps in a definite causal order. We consider the well-known entanglement distillation protocol proposed by Deutsch et al. [21] (to be referred to as the DEJMPS protocol), which turns two faulty entangled pairs into one of a higher fidelity. When three faulty pairs $\chi_i$, where $i \in \{1, 2, 3\}$, are subject to such a protocol, $\chi_1$ can be combined with $\chi_2$ before the distillation product is combined with $\chi_3$, or $\chi_1$ can be merged with $\chi_3$ and then with $\chi_2$. This defines an "order" of entanglement distillation. We describe a protocol where the two orders are put into a coherent superposition and show that, as in other previously studied cases of indefinite causal order, the advantage in fidelity comes from the nontrivial commutation of the Kraus operators of the CP map that describes the entanglement distillation. We then show the practical advantage of our protocol: for some input faulty entangled states, characterized by the amount of mixture of the four Bell states, our protocol results in a higher fidelity and/or success probability than merely putting two distillation steps in a definite causal order. Given the known connection between entanglement distillation protocols and quantum error correction codes [22], we hope this work will stimulate effort into looking for the advantage of indefinite causal structures in quantum error correction.
This paper is organized as follows. In Sec. II, we review the basic principles of recent applications of indefinite causal order in the form of a quantum switch. In Sec. III, we first review two-way entanglement distillation protocols constructed from small error-detecting codes. This includes the DEJMPS protocol and a protocol using three entangled pairs, the latter of which is constructed from a three-qubit quantum error-detecting code. We then present in Sec. IV a naively modified protocol which performs two DEJMPS distillation steps in a superposed causal order. As the circuit of the proposed protocol is quite complicated, we then describe in Sec. V a simplified protocol which we show still simulates two DEJMPS distillation steps in superposed causal orders. In Sec. VI we argue for our protocol by presenting the parameter regions of the input faulty states where our scheme shows an advantage over concatenations of the DEJMPS protocol and the three-pair protocol that follow a definite causal order. Some conclusions are then provided in Sec. VII.

II. QUANTUM SWITCH AND KRAUS OPERATORS OF COMPLETELY POSITIVE MAPS
A common framework to realize indefinite causal order between two quantum operations on a target state is to use an additional qubit to control the order in which the operations occur, such that the different orders are correlated with different basis states of the control qubit. Such a setup is named a "quantum switch" in the literature. Here, we give a short review of the basic principles of proposed applications of the quantum switch. Quantum operations are mathematically described as completely positive (CP) trace-nonincreasing maps. When two quantum operations $\mathcal{M}(\rho) = \sum_i M_i \rho M_i^\dagger$ and $\mathcal{N}(\rho) = \sum_j N_j \rho N_j^\dagger$, where $M_i$ and $N_j$ are Kraus operators (a special case is when $\mathcal{M}(\rho) = M\rho M^\dagger$ and $\mathcal{N}(\rho) = N\rho N^\dagger$ with unitary $M$ and $N$), are put into a quantum switch controlled by a qubit $\rho_c$, the overall map on the control and target states reads
$$\rho_c \otimes \rho \mapsto \sum_{i,j} W_{ij} (\rho_c \otimes \rho) W_{ij}^\dagger, \qquad (1)$$
where the overall set of Kraus operators reads
$$W_{ij} = |0\rangle\langle 0| \otimes M_i N_j + |1\rangle\langle 1| \otimes N_j M_i.$$
Previous works that studied using a quantum switch [23] to boost the communication capacity of noisy channels [11-14,24] often initialize the control state as $\rho_c = |+\rangle\langle +|$, an even superposition of passage orders through the two quantum maps, giving a resulting state
$$\frac{1}{4}\sum_{i,j}\Big(|+\rangle\langle +| \otimes \{M_i, N_j\}\rho\{M_i, N_j\}^\dagger + |+\rangle\langle -| \otimes \{M_i, N_j\}\rho[M_i, N_j]^\dagger + |-\rangle\langle +| \otimes [M_i, N_j]\rho\{M_i, N_j\}^\dagger + |-\rangle\langle -| \otimes [M_i, N_j]\rho[M_i, N_j]^\dagger\Big). \qquad (2)$$
$\rho_c$ is then measured in the Fourier ($\{|+\rangle, |-\rangle\}$) basis and the state correlated with the $|+\rangle$ measurement outcome is postselected. The benefit of having a quantum switch is due to nontrivial commutations of the sets of Kraus operators $\{M_i\}$ and $\{N_j\}$ ($[M_i, N_j] \neq 0$ for some $i, j$). To see why this is the case, suppose the opposite is true, i.e., $[M_i, N_j] = 0$ for all $i, j$. Then (2) is equal to $|+\rangle\langle +| \otimes \sum_{i,j} M_i N_j \rho N_j^\dagger M_i^\dagger$, which is simply the case of $\rho$ passing through two channels in a definite causal order.
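The role of the commutators can be checked numerically. The following is a minimal sketch, assuming the standard quantum-switch Kraus operators $W_{ij} = |0\rangle\langle 0| \otimes M_i N_j + |1\rangle\langle 1| \otimes N_j M_i$; the helper names (`switch_kraus`, `prob_minus`), the example channels (bit flip and phase flip), and the numerical parameters are illustrative choices and are not taken from the paper. With the control prepared in $|+\rangle$, the probability of finding it in $|-\rangle$ vanishes when all Kraus operators commute and is nonzero otherwise.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)  # |0><0| on the control qubit
P1 = np.diag([0, 1]).astype(complex)  # |1><1| on the control qubit


def switch_kraus(M, N):
    """Kraus operators W_ij = |0><0| (x) M_i N_j + |1><1| (x) N_j M_i (assumed convention)."""
    return [np.kron(P0, m @ n) + np.kron(P1, n @ m) for m in M for n in N]


def prob_minus(M, N, rho):
    """Probability of finding the control in |-> when it is prepared in |+>."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
    state = np.kron(np.outer(plus, plus.conj()), rho)
    out = sum(W @ state @ W.conj().T for W in switch_kraus(M, N))
    projector = np.kron(np.outer(minus, minus.conj()), I2)
    return np.trace(projector @ out).real


# Hypothetical example channels and target state.
p, q = 0.3, 0.4
bit_flip = [np.sqrt(1 - p) * I2, np.sqrt(p) * X]
phase_flip = [np.sqrt(1 - q) * I2, np.sqrt(q) * Z]
phase_flip_2 = [np.sqrt(1 - p) * I2, np.sqrt(p) * Z]
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)

# X and Z do not commute, so the |-> outcome can occur.
prob_noncommuting = prob_minus(bit_flip, phase_flip, rho)
# Two phase-flip channels commute, so the control stays in |+>.
prob_commuting = prob_minus(phase_flip_2, phase_flip, rho)
print(prob_noncommuting, prob_commuting)
```

In this example only the pair $(X, Z)$ has a nonvanishing commutator, so the $|-\rangle$ outcome occurs with probability $pq$.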

III. ENTANGLEMENT DISTILLATION PROTOCOLS
Quantum entanglement is a useful resource that is extensively utilized in many quantum technologies, such as quantum clock synchronization [25], device-independent quantum key distribution [26,27], quantum metrology [28], and distributed quantum computing [29]. In the above application settings, constituent particles of the entanglement are usually generated via local physical processes which result in quantum correlations between degrees of freedom of the constituent particles. The particles are then shared between spatially separated parties by being sent through communication channels. In practice, the communication channels are often noisy, which degrades the quality of entanglement. In this paper, we focus on the case where entanglement occurs between qubits. A common noise model results in the shared faulty pairs (which we denote as $\chi_i$) being mixtures of the four Bell states $|\Phi^\pm\rangle = (|00\rangle \pm |11\rangle)/\sqrt{2}$ and $|\Psi^\pm\rangle = (|01\rangle \pm |10\rangle)/\sqrt{2}$ [30]. Alternatively, such a mixture is compactly expressed as a column vector of its Bell-state coefficients, written as the transpose (denoted $T$) of a row vector.
We define the fidelity of $\chi_i$ as its overlap with the target Bell state, $\langle\Phi^+|\chi_i|\Phi^+\rangle$.
Entanglement distillation is a basic protocol that seeks to improve the quality of faulty entangled pairs distributed via noisy processes. In this paper, we focus on two-way entanglement distillation protocols [22]. This type of protocol involves local unitary operations, local measurements, two-way classical communication of measurement results between the communicating parties, and postselection. It is known that there is a correspondence between two-way entanglement distillation protocols and stabilizer quantum error-detecting codes [31]. As a result, the former are usually constructed from the latter. A stabilizer error-detection protocol involves the sender encoding a logical state into a larger Hilbert space using extra ancillas before sending all physical states to the receiver via a noisy channel. The receiver decodes the logical state by measuring the stabilizers of the code. The decoded logical state is kept if the obtained error syndrome signals no error on the decoded state, and discarded otherwise. In the corresponding two-way entanglement distillation protocol, multiple entangled pairs are shared between the two parties. The receiver performs the decoding circuit of the error-detecting code and the sender performs the complex conjugate of the decoding circuit. Syndrome measurements of the code are then performed by both parties. If the pre-shared entangled pairs are perfect, $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, then the syndrome measurements on both parties are perfectly correlated. Errors on the shared entanglement result in a finite amount of $|\Phi^-\rangle$, $|\Psi^+\rangle$, and $|\Psi^-\rangle$ as mixture components, which cause imperfect correlations. Nevertheless, in the presence of errors there is still a finite probability of obtaining the same syndromes on both sides, which projects the unmeasured state into a less erroneous subspace, or equivalently an entangled state of higher fidelity. We now give a review of various two-way entanglement distillation protocols constructed from small error-detecting codes. These protocols distill a small number of faulty entangled states into one entangled state with a higher fidelity.
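As a small illustration of Bell-diagonal states and of how errors degrade the correlation between the two parties' measurements, the sketch below builds a Bell-diagonal density matrix, evaluates its overlap with $|\Phi^+\rangle$, and computes the probability that Alice and Bob obtain equal computational-basis outcomes. The function names and the example weights are hypothetical and chosen only for illustration.

```python
import numpy as np

s = 1 / np.sqrt(2)
# The four Bell states in the basis |00>, |01>, |10>, |11>.
BELL = {
    "phi+": s * np.array([1, 0, 0, 1], dtype=complex),
    "phi-": s * np.array([1, 0, 0, -1], dtype=complex),
    "psi+": s * np.array([0, 1, 1, 0], dtype=complex),
    "psi-": s * np.array([0, 1, -1, 0], dtype=complex),
}


def bell_diagonal(weights):
    """Density matrix of a mixture of Bell states with the given weights."""
    return sum(w * np.outer(BELL[k], BELL[k].conj()) for k, w in weights.items())


def fidelity(chi):
    """Overlap of chi with the target state |Phi+>."""
    return (BELL["phi+"].conj() @ chi @ BELL["phi+"]).real


def agreement_probability(chi):
    """Probability that both parties get the same computational-basis outcome."""
    return (chi[0, 0] + chi[3, 3]).real


# Hypothetical example weights (they must sum to one).
weights = {"phi+": 0.7, "phi-": 0.1, "psi+": 0.15, "psi-": 0.05}
chi = bell_diagonal(weights)
print(fidelity(chi), agreement_probability(chi))
```

Only the $|\Psi^\pm\rangle$ components produce disagreeing outcomes, so the agreement probability equals the total weight of $|\Phi^+\rangle$ and $|\Phi^-\rangle$.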

A. DEJMPS protocol
The DEJMPS protocol [21] is an early entanglement distillation protocol that turns two entangled pairs into one. We denote the density matrices of the two entangled pairs as $\chi_0$ and $\chi_1$. The Hilbert space of $\chi_0$ is denoted as $H_0 = H_{0A} \otimes H_{0B}$, where $H_{0A}$ contains the state of the particle of $\chi_0$ held by one party called Alice and $H_{0B}$ contains that held by the other party called Bob. Likewise, we define $H_1 = H_{1A} \otimes H_{1B}$ for the two particles of $\chi_1$.
FIG. 1. The circuit for the DEJMPS entanglement distillation protocol. $\chi_1$, $\chi_2$ denote density matrices of the two shared entangled pairs. $\chi_i$ has its two particles sitting in Hilbert spaces $H_{iA}$ held by Alice and $H_{iB}$ held by Bob.
The protocol has a circuit shown in Fig. 1 and is described as follows.
(1) Alice performs a rotation $R = \exp(-i\frac{\pi}{4}X)$ on the particles on her side. Bob performs the rotation $R^\dagger = \exp(+i\frac{\pi}{4}X)$ on the particles on his side.
(2) Alice and Bob each perform a CNOT between the two particles on their respective sides. The two CNOTs are controlled by qubits in the same entangled pair and likewise target qubits in the same entangled pair.
(3) Alice and Bob measure the target qubits of their CNOTs in the computational basis. Alice then sends Bob her measurement result. If their results agree, the entangled state serving as control qubits in the previous CNOTs is kept. If their results differ, the control pair is discarded and they start over again. In either case, the target pair is also discarded.
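The protocol can be described as a CP trace-decreasing map which can be expressed as
$$\chi_0 \otimes \chi_1 \mapsto \sum_{i} \hat{O}_i (\chi_0 \otimes \chi_1) \hat{O}_i^\dagger, \qquad (7)$$
where the Kraus operators are
$$\hat{O}_i = \langle i|_{1A,1B}\, \mathrm{CNOT}^{1A}_{0A}\, \mathrm{CNOT}^{1B}_{0B}\, \hat{R}_{0A,1A}\, \hat{R}^\dagger_{0B,1B}, \qquad (8)$$
with $i \in \{00, 11\}$. $\mathrm{CNOT}^{1A}_{0A}$ is a CNOT gate with a control qubit in $H_{0A}$ and target qubit in $H_{1A}$, and $\hat{R}_{0A,1A} = \hat{R}_{0A} \otimes \hat{R}_{1A}$, where subscripts denote the Hilbert spaces of the operators. Equation (7) maps the combined input state in $H_0 \otimes H_1$ to an unnormalized output state in $H_0$. The summation is a mixture over the two possible measurement outcomes 00 and 11 that show even parity.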
Under the assumption that $\chi_0$ and $\chi_1$ are both Bell-diagonal states, Eq. (7) can also be expressed as
$$\chi_0 \otimes \chi_1 \mapsto \hat{O} (\chi_0 \otimes \chi_1) \hat{O}^\dagger, \qquad (9)$$
where $\hat{O} = \sqrt{2}\,\hat{O}_{00}$. This simplification is justified by the fact that the measurement projecting the state in $H_{1A} \otimes H_{1B}$ onto $|11\rangle$ introduces at most an extra "$-1$" global phase onto the state in the leftover Hilbert space, and that the extra phases produced on the "bra" and "ket" sides cancel out. This makes projecting onto $|11\rangle$ equivalent to projecting onto $|00\rangle$.
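The DEJMPS step and the equivalence of the two even-parity outcomes can be checked with a small density-matrix simulation. This is a minimal sketch under stated assumptions: the rotation is taken to be the $\pi/2$ rotation about the $x$ axis of the original DEJMPS proposal, $\exp(-i\frac{\pi}{4}X)$ for Alice and its inverse for Bob; the qubit ordering (0A, 0B, 1A, 1B), the ordering of the Bell-diagonal weights, and all function names are illustrative choices, not the paper's notation.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
s = 1 / np.sqrt(2)
# Assumed ordering of the Bell basis: (Phi+, Psi-, Psi+, Phi-).
BELL = [s * np.array(v, dtype=complex)
        for v in ([1, 0, 0, 1], [0, 1, -1, 0], [0, 1, 1, 0], [1, 0, 0, -1])]


def kron(*ops):
    return reduce(np.kron, ops)


def bell_diagonal(A, B, C, D):
    """chi = A Phi+ + B Psi- + C Psi+ + D Phi- (as projectors)."""
    return sum(w * np.outer(v, v.conj()) for w, v in zip((A, B, C, D), BELL))


def bell_coefficients(chi):
    return np.array([(v.conj() @ chi @ v).real for v in BELL])


def cnot(n, control, target):
    """CNOT on n qubits; qubit 0 is the most significant bit."""
    U = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for b in range(2 ** n):
        bits = [(b >> (n - 1 - k)) & 1 for k in range(n)]
        if bits[control]:
            bits[target] ^= 1
        out = sum(bit << (n - 1 - k) for k, bit in enumerate(bits))
        U[out, b] = 1
    return U


def dejmps(chi0, chi1, outcomes=(0, 1)):
    """One DEJMPS step; chi0 is the kept (control) pair, chi1 the measured pair.

    Qubit order is (0A, 0B, 1A, 1B). Returns (normalized output, probability)
    summed over the listed even-parity outcomes (0 -> '00', 1 -> '11').
    """
    U = (I2 - 1j * X) / np.sqrt(2)            # exp(-i pi/4 X), assumed rotation
    rot = kron(U, U.conj().T, U, U.conj().T)  # Alice: U, Bob: U^dagger
    gate = cnot(4, 1, 3) @ cnot(4, 0, 2) @ rot
    rho = gate @ kron(chi0, chi1) @ gate.conj().T
    out = np.zeros((4, 4), dtype=complex)
    for m in outcomes:
        bra = np.zeros((1, 4), dtype=complex)
        bra[0, 3 * m] = 1                     # <00| or <11| on (1A, 1B)
        proj = np.kron(np.eye(4), bra)        # 4 x 16 projection map
        out += proj @ rho @ proj.conj().T
    p = np.trace(out).real
    return out / p, p


# Hypothetical input pairs.
chi_a = bell_diagonal(0.7, 0.1, 0.15, 0.05)
chi_b = bell_diagonal(0.6, 0.2, 0.1, 0.1)
out_both, p_both = dejmps(chi_a, chi_b)
out_00, p_00 = dejmps(chi_a, chi_b, outcomes=(0,))
print(bell_coefficients(out_both), p_both)
```

For Bell-diagonal inputs the "00" branch alone reproduces the same normalized output with half the probability, which is the content of the simplification leading to Eq. (9).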
In practice, the applications that consume entangled pairs often demand high fidelity, which can be obtained from repeated applications of single DEJMPS steps using $x > 2$ faulty pairs. This can alternatively be viewed as a single faulty pair undergoing multiple steps of distillation, where each step can be defined with respect to the entangled pair that the original one is combined with. We may then define the causal order of distillation to be the order in which the original pair is combined with the others. As an example, for three faulty entangled states $\chi_i$ (where $i \in \{1, 2, 3\}$), we may distill $\chi_3$ and $\chi_2$ into a product of higher fidelity, which is then distilled with $\chi_1$. The second distillation step is clearly in the causal future of the first step, as it requires the product of the first step as its input. Likewise, the first step is in the causal past of the second step.
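That the order of distillation matters can be seen from the closed-form action of a DEJMPS step on Bell-diagonal weights. The sketch below assumes the standard recurrence for Bell-diagonal inputs with weights ordered as $(\Phi^+, \Psi^-, \Psi^+, \Phi^-)$; the function name and the three example states are hypothetical.

```python
def dejmps_step(x, y):
    """Closed-form DEJMPS step on Bell-diagonal weights (A, B, C, D).

    Weights are ordered as (Phi+, Psi-, Psi+, Phi-). Returns the normalized
    output weights and the success probability of the step.
    """
    A1, B1, C1, D1 = x
    A2, B2, C2, D2 = y
    unnormalized = (A1 * A2 + B1 * B2,   # Phi+
                    C1 * D2 + D1 * C2,   # Psi-
                    C1 * C2 + D1 * D2,   # Psi+
                    A1 * B2 + B1 * A2)   # Phi-
    p = sum(unnormalized)
    return tuple(u / p for u in unnormalized), p


# Hypothetical faulty pairs.
chi3 = (0.7, 0.1, 0.1, 0.1)
chi2 = (0.8, 0.05, 0.1, 0.05)
chi1 = (0.6, 0.2, 0.1, 0.1)

# Order chi3 -> chi2 -> chi1.
first, p_a1 = dejmps_step(chi3, chi2)
out_21, p_a2 = dejmps_step(first, chi1)

# Order chi3 -> chi1 -> chi2.
first, p_b1 = dejmps_step(chi3, chi1)
out_12, p_b2 = dejmps_step(first, chi2)

print("fidelity, order 3-2-1:", out_21[0], "success prob:", p_a1 * p_a2)
print("fidelity, order 3-1-2:", out_12[0], "success prob:", p_b1 * p_b2)
```

The two orders give different output fidelities and success probabilities for these inputs, which is what makes the choice of a definite causal order meaningful.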
An n-to-one distillation scheme when n > 3 has more possible arrangements of single DEJMPS steps compared with the case when n = 3.As an example, below are all the arrangements when n = 4.
(1) Select the supplied pair with the maximum fidelity and discard the other three. No distillation is done.
(2) Perform one distillation step using two of the four pairs and discard the other two pairs.
(3) Select three pairs and discard the fourth pair. Within the three chosen ones, select two to perform one step of distillation, the product of which is teamed up with the third pair for another step.
(4) Group all four pairs into teams of two, where one step of distillation is carried out separately for each team before the two products are teamed up for another step. This is called a "recurrence-like structure" [32].
(5) Select two pairs to do one step of distillation, the product of which is purified by combining it with either of the two unchosen pairs before the product is combined with the last unchosen pair for a third step. This is called a "pumping-like structure" [32,33].
For four faulty pairs χ i (where i ∈ {0, 1, 2, 3}), we denote the set which contains all the above distillation arrangements as G.
In this paper, we are concerned with the use of entangled states in near-term quantum communication tasks. The most important metrics of two-way entanglement distillation protocols are the fidelities of the output states and their probabilities of success. Apart from these, we recognize another important metric: the number of memories required to carry out the protocol. Quantum states involved in various quantum communication tasks will likely be stored inside memory-based quantum repeaters. Although it is desired that, in the long term, quantum repeaters will be based on error-correcting codes which transmit quantum states fault-tolerantly, in the near term they will have few memories, high memory noise, imperfect quantum gates, and high photon loss, which cannot be corrected via error correction. Instead, they are expected to rely on heralded entanglement generation and two-way entanglement distillation to reduce errors [34,35]. An entanglement distillation protocol that uses fewer quantum memories leaves more vacant spaces to receive incoming generated entangled pairs, and hence boosts the rate of entanglement distribution.
We note that the concatenated DEJMPS protocols in set G all use at most four entangled pairs, but require at most three memory units (for each communicating party) to perform. Consider the "recurrence" structure, where the entangled pair distilled from the first DEJMPS step and two other newly generated pairs taking part in the parallel DEJMPS step are stored. Although a total of four entangled pairs are used in this arrangement, only a maximum of three pairs need to be stored in the memories at any time. Later, in Sec. V, we will introduce a class of protocols that simulates two DEJMPS steps applied in an indefinite causal order. Such protocols also make use of four entangled pairs, but only require three entangled pairs to be stored at any time. Nevertheless, there do exist other four-entangled-state protocols that satisfy this storage requirement. We describe them as follows.

B. Three-pair distillation protocols
The DEJMPS protocol described above is constructed from the two-bit repetition code (up to an initial single-qubit rotation $R$), whose codeword is stabilized by $\hat{Z}\hat{Z}$, where $\hat{Z}$ is the Pauli Z matrix. As a result, the decoding circuit of the code (also up to the rotation $R$) is performed on both sides of the entangled pairs to detect errors that do not commute with the stabilizer. When more than two pairs are shared, larger entanglement distillation schemes can be constructed from larger error-detecting codes. We consider a three-qubit code which is stabilized by $\hat{S}_1 = \hat{I}\hat{Z}\hat{Z}$ and $\hat{S}_2 = \hat{X}\hat{X}\hat{X}$. In the decoding circuit of this code, the top measurement obtains the syndrome $\hat{S}_2$ and the bottom measurement obtains the syndrome $\hat{S}_1$. This leads to the three-pair entanglement distillation circuit shown in Fig. 2 for three entangled pairs $\chi_0$, $\chi_1$, and $\chi_2$.
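The error-detecting property of the three-qubit code can be verified directly from its stabilizers. The following sketch checks that $\hat{S}_1 = \hat{I}\hat{Z}\hat{Z}$ and $\hat{S}_2 = \hat{X}\hat{X}\hat{X}$ commute, computes the dimension of the stabilized subspace, and lists which single-qubit $X$ and $Z$ errors anticommute with at least one stabilizer. The helper names are hypothetical, and the check covers only single-qubit $X$ and $Z$ errors.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])


def pauli(*ops):
    """Tensor product of single-qubit operators, first argument = qubit 1."""
    return reduce(np.kron, ops)


S1 = pauli(I2, Z, Z)
S2 = pauli(X, X, X)


def detected(error):
    """An error is detected if it anticommutes with at least one stabilizer."""
    return any(np.allclose(S @ error, -error @ S) for S in (S1, S2))


# Projector onto the joint +1 eigenspace of S1 and S2.
code_projector = (np.eye(8) + S1) @ (np.eye(8) + S2) / 4
code_dimension = int(round(np.trace(code_projector)))

single_qubit_errors = {
    "X1": pauli(X, I2, I2), "X2": pauli(I2, X, I2), "X3": pauli(I2, I2, X),
    "Z1": pauli(Z, I2, I2), "Z2": pauli(I2, Z, I2), "Z3": pauli(I2, I2, Z),
}
undetected = [name for name, e in single_qubit_errors.items() if not detected(e)]
print(code_dimension, undetected)
```

The stabilized subspace is two-dimensional, i.e., it holds one logical qubit, and among the listed errors only an $X$ error on the first qubit commutes with both stabilizers.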
We have added the same extra rotation $R = \exp(-i\frac{\pi}{4}X)$ to be consistent with the DEJMPS protocol. When four pairs are shared, there are again multiple possible arrangements to distill them into one pair. We denote the set which contains all the following arrangements as J.
(1) Select three pairs, arrange them in some permutation and distill them into one pair using Fig. 2's circuit. Discard the fourth pair.
(2) Select three pairs and distill them into one using Fig. 2's circuit. A further DEJMPS step is performed on the distilled product and the fourth pair.
(3) Select two pairs and distill them into one pair using the DEJMPS protocol. The product and the two remaining pairs are distilled into one using Fig. 2's circuit.

IV. APPLYING TWO DEJMPS STEPS IN A SUPERPOSITION OF CAUSAL ORDERS
All protocols P ∈ G and P ∈ J are examples of distillation steps carried out in a definite causal order, where the causal orders between different steps are well defined. In this section, we introduce our modification to break this causality.

A. Circuit that coherently controls the order of two DEJMPS steps
We consider the following process: the four Bell-diagonal faulty pairs $\chi_i$ are shared between Alice and Bob as illustrated in Fig. 3. When some control system is in a logical state $|0\rangle$, $\chi_3$ undergoes a DEJMPS step with $\chi_2$ first, and the distillation output then undergoes another DEJMPS step with $\chi_1$. When the control system is in $|1\rangle$, orthogonal to $|0\rangle$, $\chi_3$ is routed in the quantum registers to first combine with $\chi_1$, and the distillation output is then combined with $\chi_2$. We let the logical state $|0\rangle$ ($|1\rangle$) be encoded in the entangled pair $\chi_0$ as $|00\rangle$ ($|11\rangle$) and refer to $\chi_0$ as the "control pair" of the protocol. The above process can then be realized by the part of the circuit shown in Fig. 4 before the vertical dashed line. In Fig. 4, each entangled pair $\chi_i$ is inside Hilbert space $H_{iA} \otimes H_{iB}$. We denote $\rho_{in} = \chi_1 \otimes \chi_2 \otimes \chi_3$. One can show that before the vertical dashed line, the overall state in Hilbert space $H_{0A} \otimes H_{0B} \otimes H_{1A} \otimes H_{1B}$, which consists of the control pair and the bipartite state distilled from the two DEJMPS steps, can be expressed as in Eq. (10), where $\mathrm{SWAP}^{2A}_{3A}$ is a SWAP gate between the states in Hilbert spaces $H_{3A}$ and $H_{2A}$. $K^{(1)}_i$ and $K^{(2)}_j$ are the operations on $\rho_{in}$ in the case that the control pair $\chi_0$ is in the state $|01\rangle$ or $|10\rangle$. We do not give the explicit expressions of $K^{(1)}_i$ and $K^{(2)}_j$ as they do not contribute to the discussion. $\hat{E}^{(m)}_i$ is essentially the Kraus operator of a DEJMPS step as given in Eq. (7). The effective Kraus operator $\hat{E}_{ij}$ in the first term of Eq. (10) combines the two orderings of $\hat{E}^{(1)}_i$ and $\hat{E}^{(2)}_j$, the Kraus operators of the two DEJMPS steps. This clearly shows that the two distillation steps are put into an indefinite causal order. The second term in Eq. (10) occurs when the control pair $\chi_0$ has Bell components $|\Psi^+\rangle\langle\Psi^+|$ and/or $|\Psi^-\rangle\langle\Psi^-|$. This gives rise to another completely positive map which cannot be easily interpreted as $\chi_3$ going through two DEJMPS steps in any order.
FIG. 4. A modified entanglement distillation scheme which turns four faulty Bell-diagonal states $\chi_i$ (each residing in Hilbert space $H_{iA} \otimes H_{iB}$) into one. The circuit features two DEJMPS distillation steps applied in a superposition of causal orders which is coherently controlled by the faulty pair $\chi_0$. The red and blue dashed rectangles encircle the CNOT gates of the two DEJMPS steps applied in the $\chi_3 \to \chi_1 \to \chi_2$ ($\chi_3 \to \chi_2 \to \chi_1$) causal order. After the vertical dashed line in the figure, we postselect the distilled state upon receiving an even-parity measurement outcome from the state in $H_{0A} \otimes H_{0B}$. This makes the products distilled from the two causal orders interfere, with the hope of boosting the fidelity of the final product of distillation.
After the two DEJMPS steps, a Hadamard gate is applied to both particles of the control pair, followed by a parity measurement on the control pair (shown in Fig. 4 after the vertical dashed line). We then postselect the bipartite state in the unmeasured Hilbert space $H_{1A} \otimes H_{1B}$ on an even-parity outcome. This allows the distillation products from the two causal orders to interfere, and the state in the unmeasured Hilbert space is the final product of the protocol.
One sees from Fig. 4 that the circuit is quite complicated. It also requires the control pair to coherently control every gate within the DEJMPS protocol. This gives rise to a large number of double-controlled gates between three qubits, which are difficult to implement. These factors make the circuit difficult to demonstrate as an entanglement distillation protocol. In Sec. V, we introduce a much simpler circuit that also simulates two DEJMPS steps in an indefinite causal order. In this new circuit, the causal orders of the DEJMPS steps are not emulated by routing $\chi_3$, but merely by control-SWAPping $\chi_1$ and $\chi_2$. We show that the output state under this simplified protocol can be expressed as $\rho_{in}$ undergoing two maps with Kraus operators of the form defined in Eq. (9). The new circuit uses a smaller number of gates, and in particular far fewer controlled gates among three particles. These features should make the new circuit easier to demonstrate in practice.

V. SIMPLIFIED COHERENT CONTROL OF ORDER OF TWO DEJMPS STEPS
The simplified scheme of coherently controlling the order of two DEJMPS distillation steps has a circuit shown in Fig. 5 and is described as follows.
(1) We apply a controlled-SWAP gate controlled by the particle in Hilbert space $H_{0A}$ and targeting the two particles in Hilbert spaces $H_{1A}$ and $H_{2A}$. Likewise, the particle in Hilbert space $H_{0B}$ controls a controlled-SWAP gate on the two particles in Hilbert spaces $H_{1B}$ and $H_{2B}$. The controlled-SWAP gate swaps the two target states if the control qubit is in $|1\rangle$.
(2) We apply one step of the DEJMPS protocol on Hilbert spaces $H_{3A} \otimes H_{3B}$ and $H_{2A} \otimes H_{2B}$. If the measurement outcomes show even parity, $H_{2A}$ and $H_{2B}$ are kept and we proceed to step 3. Otherwise we discard all pairs and start the scheme over with newly supplied faulty pairs.
(3) We apply another step of the DEJMPS protocol on Hilbert spaces $H_{2A} \otimes H_{2B}$ and $H_{1A} \otimes H_{1B}$. If the parity measurement outcome is even, $H_{1A}$ and $H_{1B}$ are kept and we continue to step 4. Otherwise we discard all pairs and start again from the beginning.
(4) We measure $H_{0A}$ and $H_{0B}$ separately in the Fourier $\{|+\rangle, |-\rangle\}$ basis and compare the results. If they show even parity, the scheme is successful. Otherwise we discard all pairs and start again from the beginning.
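The way a controlled-SWAP followed by a Fourier-basis measurement of the control makes the two orderings interfere can be illustrated on a single party. The sketch below is a simplified, hypothetical illustration: it uses one control qubit prepared in $|+\rangle$ rather than one half of the faulty control pair $\chi_0$, and it omits the DEJMPS steps between the controlled-SWAP and the final measurement. The names `CSWAP`, `ket`, and `post_selected_targets` are illustrative.

```python
import numpy as np

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

# Controlled-SWAP: the two targets are swapped only if the control is |1>.
CSWAP = np.kron(P0, np.eye(4)) + np.kron(P1, SWAP)


def ket(bits):
    """Computational-basis state labelled by a bit string, e.g. ket('01')."""
    v = np.zeros(2 ** len(bits), dtype=complex)
    v[int(bits, 2)] = 1
    return v


plus = (ket("0") + ket("1")) / np.sqrt(2)


def post_selected_targets(psi):
    """Unnormalized target state after control |+>, CSWAP, and outcome |+>."""
    state = CSWAP @ np.kron(plus, psi)
    bra_plus = np.kron(plus.conj().reshape(1, 2), np.eye(4))  # <+| on control
    return bra_plus @ state


psi = ket("01")
print(CSWAP @ np.kron(ket("1"), psi))  # equals |1>|10>
print(post_selected_targets(psi))      # equals (|01> + |10>) / 2
```

Postselecting the control on $|+\rangle$ leaves the targets in $(I + \mathrm{SWAP})/2$ applied to their input, i.e., an equal-weight superposition of the unswapped and swapped arrangements.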
Although we have described operations during the protocol as occurring among certain Hilbert spaces, the above protocol also works for other combinations of the Hilbert spaces. For example, the control-SWAP in step 1 can instead be controlled by particles in Hilbert spaces $H_{1A} \otimes H_{1B}$ and target $H_{2A} \otimes H_{2B}$ and $H_{3A} \otimes H_{3B}$, with the two DEJMPS steps then carried out between states in the correspondingly relabelled Hilbert spaces. This variation defines a whole set of protocols (there are in fact 12 such protocols), which we denote as S, each having a similar structure.

A. Difference between routing χ 3 and swapping χ 1 and χ 2
One may be tempted to infer that the two circuits in Figs. 4 and 5 are equivalent: whether routing $\chi_3$ around while keeping $\chi_2$ and $\chi_1$ still (in Fig. 4) or swapping $\chi_2$ and $\chi_1$ while keeping $\chi_3$ still (in Fig. 5) should not matter, as both simulate $\chi_3$ going through DEJMPS distillations with $\chi_2$ and $\chi_1$ in two opposite orders. This is not actually the case, due to the parity measurements that are part of the distillation protocol. In Fig. 4's circuit, the two DEJMPS steps, each associated with $\chi_2$ or $\chi_1$, can have different parity measurement results (00 or 11, under the even-parity requirement), and the parity measurement results associated with $\chi_2$ and $\chi_1$ are the same for both causal orders (whether $\chi_2$ is used after or before $\chi_1$). In Fig. 5's circuit, however, the first DEJMPS steps (regardless of whether they are with $\chi_2$ or $\chi_1$) performed in both causal branches have the same parity measurement outcome, and so do the second DEJMPS steps in both causal branches. This means that if the CP trace-decreasing map of each DEJMPS step is expressed as Eq. (7), which is a sum over projections onto $|00\rangle$ and $|11\rangle$, the overall state before the vertical dashed line in Fig. 5's circuit cannot be written as Eq. (10), where the effective Kraus operator $\hat{E}_{ij}$ has components that exactly swap the Kraus operators of the two DEJMPS steps. We therefore cannot interpret the distillation output of Fig. 5's circuit as $\chi_3$ going through two DEJMPS steps in an indefinite causal order, if the completely positive map of the DEJMPS protocol is given as Eq. (7).

B. Completely positive trace-decreasing maps under swapping of χ 1 and χ 2
In this section, we show that Fig. 5's circuit can be interpreted as $\chi_3$ undergoing two DEJMPS steps in an indefinite causal order, if the DEJMPS protocol is instead described by Eq. (9). We point out that for Bell-diagonal states as considered in this paper, both Eq. (7) and Eq. (9) are equally valid quantum maps describing the DEJMPS protocol, since mathematically they produce exactly the distillation output defined by the DEJMPS circuit [21]. In this section, we assume the control pair is $\chi_0$, which control-SWAPs $\chi_1$ and $\chi_2$. This makes $\rho_{in} = \chi_1 \otimes \chi_2 \otimes \chi_3$. Following the circuit in Fig. 5, one can show that at the dashed line before the final Hadamard gates and parity measurements, the state of the control pair and the distilled pair is given by Eqs. (15)-(19), where $i, j \in \{00, 11\}$. The explicit expressions of $T_1$, $T_2$, $L_1$, and $L_2$ are given in Sec. V C; they can be found by tracing the circuit in Fig. 5 in the case that the control pair is in state $|01\rangle$ or $|10\rangle$.
Theorem 1. $N_1$ and $N_2$, the (unnormalized) distillation product states from two DEJMPS steps in opposite causal orders, can be expressed as in Eqs. (23) and (24), while $M_1$ and $M_2$, the (unnormalized) entangled states correlated with the off-diagonal terms between the two basis states of the control pair that C-SWAPs $\chi_1$ and $\chi_2$, can be expressed as in Eqs. (25) and (26), where the Kraus operators $\hat{Q}_1$ and $\hat{Q}_2$ [Eqs. (27) and (28)] are equivalent to the Kraus operator of a single DEJMPS step that only projects the parity-measured states onto $|00\rangle$, up to extra SWAP gates that relabel the Hilbert spaces of the subsystems.
Proof. To show that $N_1$, $N_2$, $M_1$, and $M_2$ can be expressed as Eqs. (23)-(26), two major steps are carried out. We first show that both outcomes of the parity measurements (00 and 11) yield the same distillation output. This allows us to express the mixture over projections onto "00" and "11" as only projecting onto "00." Secondly, we intersperse the gates inside $\hat{O}_i$, $\hat{P}_i$, and $\hat{F}_i$ with additional SWAP gates to relabel some of the involved Hilbert spaces in order to arrive at $\hat{Q}_1$ and $\hat{Q}_2$.
For convenience of description, we introduce a more compact notation that denotes χ i as where It is not immediately clear that the two parity measurement outcomes (00 and 11) are correlated with equivalent output state and as a result, it is not clear that the sum over "00" and "11" in Eqs. ( 16)- (19) can just be re-expressed as projecting only onto "00."To see this, we first examine the inner parts of N 1 , N 2 and M 1 : Ôi  15)-( 19)], the global phases generated during the parity measurement become relative phases among the above three terms, which can affect the final distilled state nontrivially.However, the fact that the CNOTs preserve the sign of the target Bell state means after the CNOTs, the Bell states on the "bra" and "ket" sides must have the same sign (b 3 ) regardless of the control Bell state of the CNOTs on the two sides.The global phases induced on the two sides cancel to give an overall "+1" phase, making the "00" and "11" outcomes have equivalent effect onto the leftover Hilbert space.Mathematically, the following relations are obtained: Ô00 Υi † F 11 † for i ∈ {00, 11}, where ˆ and Υ are either Ô or P. Since this is true for a,b of arbitrary a, b, it is also true for ρ in , which is a mixture of a,b of different a and b.We can now define F = √ 2 F 00 , Ô = √ 2 Ô00 and P = √ 2 P00 and express N 1 , N 2 , M 1 and M 2 as ρ in going through single Kraus operators, rather than mixtures over Kraus operators that project onto different parity-measurement outcomes 00 or 11: We now want to express N 1 as ρ in passing through two CP trace-decreasing maps in one order, and N 2 as ρ in undergoing the same two maps in the opposite order.We have defined a DEJMPS step, which has its distinct CP trace-decreasing map and Kraus operator, solely with respect to the entangled pair (χ 1 or χ 2 ) that χ 3 is combined with.In Fig. 
5's circuit, however, the faulty state subject to the second distillation step (regardless of which entangled pair it is combined with) is in Hilbert space H_{2A} ⊗ H_{2B}, which is different from the Hilbert space of χ_3 (H_{3A} ⊗ H_{3B}) during the first distillation step. Since operations on distinct Hilbert spaces are described by different operators, this prevents the same Kraus operator that describes the first distillation step in one causal order from also describing the second distillation step in the other causal order. To resolve this, we add extra SWAP operations to F̂, Ô, and P̂ such that the state produced from each DEJMPS step is always in Hilbert space H_{3A} ⊗ H_{3B}, regardless of whether the DEJMPS step is done first or second in the queue. Consider the term N_1. We add two SWAP gates, SWAP_{3A,2A} SWAP_{3B,2B}, between the CNOTs and the parity measurement of operator Ô [in Eq. (20)]. In order for the circuit's output to remain unchanged, this must also change the projected Hilbert spaces of the parity measurement of Ô from H_{3A} ⊗ H_{3B} to H_{2A} ⊗ H_{2B}, and as a result the unmeasured state is now in H_{3A} ⊗ H_{3B}. The modified Ô is shown to equal Q̂_1. The Hilbert spaces on which the gates in the subsequent Kraus operator F̂ act are also swapped between H_{3A} ⊗ H_{3B} and H_{2A} ⊗ H_{2B}. We then add two other SWAP gates, SWAP_{3A,1A} SWAP_{3B,1B}, between the CNOTs and the parity measurement of F̂, which changes the parity-measured Hilbert space to H_{1A} ⊗ H_{1B}. This turns F̂ into Q̂_2. We emphasize that the additional SWAP gates are introduced merely for algebraic reasons and are not implemented physically.
As for the term N_2 in Eq. (32), one can see from the construction of Ô_i and P̂_i (hence Ô and P̂) in Eqs. (20) and (21) that N_2 is essentially N_1 with the labels of Hilbert spaces H_{2A} ⊗ H_{2B} and H_{1A} ⊗ H_{1B} swapped. One can then conclude that if the same procedure of adding SWAP gates is carried out on N_2, the result will be that of N_1 [in Eq. (23)] followed by a relabelling of Hilbert spaces H_{2A} ⊗ H_{2B} and H_{1A} ⊗ H_{1B}, which is exactly equal to an exchange of Q̂_1 and Q̂_2, leading to Eq. (24). The manipulations on M_1 and M_2 follow a similar description that leads to Eqs. (25) and (26). One can see from Eqs. (27) and (28) that Q̂_1 and Q̂_2 are essentially the Kraus operator Ô given in Eq. (9), but with extra SWAP gates that relabel the Hilbert spaces of some of the states, which does not affect the distillation output.
According to Eqs. (23) and (24), one can regard N_2 as the input state ρ_in undergoing the trace-decreasing maps Q̂_1 and Q̂_2 in the opposite order to that in N_1. Overall, the circuit in Fig. 5 before the final Hadamard and parity measurement can be expressed as in Eq. (36), with Ĵ and Ŝ being the operators that act on ρ_in in case the control pair is in state |01⟩ and |10⟩. Equation (36) features opposite orders of Q̂_1 and Q̂_2 correlated with different states of the control pair. We have hence shown that the circuit in Fig. 5 simulates two DEJMPS steps applied in a causal order controlled by χ_0.

VI. ADVANTAGE IN FIDELITY AND PROBABILITY OF SUCCESS
We now compare, for four given input faulty states, the output fidelity from our set of protocols S against the concatenated DEJMPS protocols (denoted as set G) and the three-pair distillation protocols (denoted as set J). We point out that previous works studying indefinite causal structures in quantum information processing tasks have argued their advantage by comparing them against definite causal structures that consist of the same number of elementary CP maps. This can be justified by treating the control state in the quantum switch as a free resource, which is reasonable in some experimental setups that are used to implement these tasks (e.g., interferometers, where the control state is simply the propagation paths of photons). Here in our set of protocols S, the control of causal order is carried out by an entangled pair. This extra pair should not be seen as free, but as costly as the other entangled pairs involved in the protocol, since they all practically take similar physical resources to generate. Therefore the comparison of protocols S should not be made against only the concatenated two DEJMPS steps, but against the set of definite-causal-order protocols that also turn four faulty entangled pairs into one. These protocols are exactly those that form the sets of protocols G and J. We note that we do not compare our protocols against those entanglement distillation protocols that utilize four entangled pairs together (such protocols can usually be constructed from four-bit quantum error-detecting codes), as they require four memory units to carry out, which is a higher requirement than that of the above-mentioned protocols.
For protocols in S, we note that the output of the protocol, ρ_+ from Eq. (39), is a mixture of five terms: N_1, N_2, M, T, and L. In order for ρ_+ to have a fidelity advantage over the distillation products of all protocols in G, at least one term among M, T, and L must have a fidelity larger than both N_1 and N_2, since N_1 and N_2 themselves are the products of distillation from two definite-ordered DEJMPS steps, which are member protocols of the set G. In this section, aside from comparing the fidelities and probabilities of success of protocol sets S and G, we show that the advantage in fidelity comes solely from M (rather than from T and L), owing to the nontrivial commutation (i.e., [Q̂_2, Q̂_1] ≠ 0) between the Kraus operators of the two maps that describe the two DEJMPS steps. This is the same origin as that of the advantage of a quantum switch, as reviewed earlier in Sec. II. This confirms that the advantage in fidelity of our scheme is indeed due to applying entanglement distillation maps in a coherent superposition of two causal orders.
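The role of the commutator can be illustrated with a toy quantum switch. The sketch below is only an illustration under stated assumptions: it uses the single-qubit unitaries X and Z as hypothetical stand-ins for the two operations (they are not the Kraus operators Q̂_1 and Q̂_2 of the protocol), a control qubit prepared in |+⟩, and a final measurement of the control in the |±⟩ basis. The "−" outcome carries weight only when the two operations fail to commute.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Apply a 2x2 matrix to a length-2 state vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]


def norm_sq(v):
    """Squared norm of a state vector."""
    return sum(abs(c) ** 2 for c in v)


# Toy stand-ins for the two operations (assumption: not the paper's Q1, Q2).
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def switch_probs(k1, k2, psi):
    """Quantum switch with control |+>, control measured in the |+>, |-> basis.

    Returns (p_plus, p_minus), the probabilities of the two control outcomes.
    The unnormalized target states are (K2 K1 +/- K1 K2)|psi> / 2.
    """
    first = apply(matmul(k2, k1), psi)   # K1 acts first, then K2
    second = apply(matmul(k1, k2), psi)  # K2 acts first, then K1
    plus = [(a + b) / 2 for a, b in zip(first, second)]
    minus = [(a - b) / 2 for a, b in zip(first, second)]
    return norm_sq(plus), norm_sq(minus)


psi = [1, 0]
print("X and Z (anticommuting):", switch_probs(X, Z, psi))
print("Z and Z (commuting):    ", switch_probs(Z, Z, psi))
```

For commuting operations the switch reduces to a fixed order and the "−" branch vanishes, which is why a nonzero commutator is needed for any interference term to appear.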
We first present a discrete example of input states χ_i where protocol set S returns a higher fidelity and success probability than protocols in G and J, showing a clear overall advantage of protocols S. We find that the advantage occurs for faulty states with noise biases close to that of Werner states. We then present the parameter region of input fidelities where the advantage of S holds, assuming the input states are Werner states, and also comment on the distillation performance of S when the input states have biased noise.

Werner states as input
There is a continuous region of input-state parameters where the advantage holds. We first consider the case where the faulty pairs all experience depolarising noise on |Φ+⟩, which turns them into Werner states of the form χ_i = F_i |Φ+⟩⟨Φ+| + [(1 − F_i)/3] (|Φ−⟩⟨Φ−| + |Ψ+⟩⟨Ψ+| + |Ψ−⟩⟨Ψ−|). We search over the parameter space of the four input fidelities F = [F_0, F_1, F_2, F_3]. For each set of parameters F, we find, over all protocols in S, the one that gives the maximum fidelity (denoted as F_S) and also find its probability of success (denoted as p_S). The same procedure is carried out for the protocol sets G and J, where the maximum fidelities in G and J are denoted as F_G and F_J, with corresponding success probabilities p_G and p_J. Two searches using the basin-hopping algorithm [36], which minimize F_G − F_S and F_J − F_S respectively, are carried out within this parameter space. The input fidelities where our protocols show advantage are those where F_G − F_S < 0 and F_J − F_S < 0. Mathematically, each inequality corresponds to a region in the parameter space, and the intersection of the two regions is recorded.
As an example, a point in the intersected region reads F = [0.5390, 0.6332, 0.6332, 0.5888]. When the four input states have those fidelities, the protocol set S produces a state (0.6853, 0.0802, 0.0802, 0.1543)^T with fidelity F_S = 0.6853 and probability p_S = 0.2121. To obtain such a state, the entangled state χ_0, which has fidelity 0.5390, is used to control-SWAP χ_1 and χ_2, which have fidelity 0.6332. Among all protocols in G, the one with a "recurrence" structure, which first combines χ_0 with χ_1 and χ_2 with χ_3 before combining their purified products, yields a state (0.6842, 0.0553, 0.1314, 0.1291)^T with F_G = 0.6842 and p_G = 0.2069. Among all protocols in J, the arrangement that first puts χ_0 and χ_1 together for a DEJMPS step, followed by a three-pair distillation circuit among its product, χ_2, and χ_3, yields a maximum fidelity F_J = 0.6842 and success probability p_J = 0.2069 from a state (0.6842, 0.1314, 0.1291, 0.0553)^T. Our set of protocols S has a clear overall advantage over the causally ordered distillation protocols, producing a state with a higher fidelity and a higher success probability than the latter.
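The definite-order "recurrence" figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard DEJMPS update rule for Bell-diagonal states and assuming that the four-component vectors are ordered so that the first entry is the fidelity; the helper names `werner` and `dejmps` are introduced here for illustration only. Under these assumptions the sketch should reproduce F_G ≈ 0.6842 and p_G ≈ 0.2069 to about three decimal places.

```python
def werner(f):
    """Bell-diagonal coefficients (A, B, C, D) of a Werner state with fidelity f.

    A is the weight of the target Bell state; the remaining weight is spread
    evenly over the three other Bell states.
    """
    e = (1.0 - f) / 3.0
    return (f, e, e, e)


def dejmps(s1, s2):
    """One DEJMPS step on two Bell-diagonal pairs (standard recurrence rule).

    Returns the normalized output coefficients and the success probability.
    """
    a1, b1, c1, d1 = s1
    a2, b2, c2, d2 = s2
    n = (a1 + b1) * (a2 + b2) + (c1 + d1) * (c2 + d2)  # success probability
    out = (
        (a1 * a2 + b1 * b2) / n,
        (c1 * d2 + d1 * c2) / n,
        (c1 * c2 + d1 * d2) / n,
        (a1 * b2 + b1 * a2) / n,
    )
    return out, n


# Input fidelities of the worked example.
F = [0.5390, 0.6332, 0.6332, 0.5888]
chi = [werner(f) for f in F]

# Recurrence structure: (chi0, chi1) and (chi2, chi3), then combine the products.
left, p_left = dejmps(chi[0], chi[1])
right, p_right = dejmps(chi[2], chi[3])
final, p_last = dejmps(left, right)

# All three steps must succeed, so the probabilities multiply.
p_total = p_left * p_right * p_last

print("output state:", [round(x, 4) for x in final])
print("fidelity:", round(final[0], 4), "success probability:", round(p_total, 4))
```

The sketch covers only this one member of the set G; it does not model the control-SWAP protocol S, whose output requires the full expression for ρ_+.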
As presented in Eq. (39), ρ_+ is a weighted mixture of N_1, N_2, M, T, and L, each of which is an unnormalized mixture of the four Bell states. We calculate the fidelities of all the mixture components of ρ_+ and find that M is the only component with a fidelity higher than those of N_1 and N_2, which are the fidelities of the entangled states produced by two definite-ordered DEJMPS steps. This means the fidelity advantage of protocols S is solely due to the presence of the component M. As discussed in Sec. V, the presence of M is due to the nontrivial commutation of the Kraus operators that correspond to the maps of the two distillation steps. This shows that the fidelity advantage of protocols S indeed comes from two DEJMPS steps being applied in an indefinite causal order.

Input states with noise bias
We study the effect of noise bias of the faulty pairs on the distillation fidelities. Noise on the entangled pairs comes from interaction of the particles with the environment, during which the target entangled state becomes entangled with the environment. Tracing out the latter leaves the former in a probabilistic mixture of various states. The Pauli-diagonal channel, which results in the mixture components being the target state undergoing Pauli X, Y, and Z flips, is a fairly complete description of all possible noise models. In practice, the probabilities of undergoing the three flips are different. We define the X-biased channel with a degree r_X as the CPTP map of Eq. (45). Here r = 1 indicates a noise channel with complete X bias, r = 0 indicates noise biased away from X, and r = 1/3 indicates a depolarising channel with equal noise probabilities, which gives rise to the previously discussed Werner states. A Y-biased noise channel with degree r_Y and a Z-biased channel with degree r_Z can be defined in the same manner. Suppose the fidelities of the four faulty pairs are the ones given in the previous discussion (F = [0.5390, 0.6332, 0.6332, 0.5888]) but each faulty pair now has unequal erroneous Bell-state components, as caused by the noise bias. Figure 6 shows the distillation fidelities F_G, F_S, and F_J of the three groups of protocols under the three directions of noise bias: (a) Y-biased noise, (b) X-biased noise, and (c) Z-biased noise.
The fidelity of protocols S is larger than those of G and J for a relatively unbiased noise model (when r_X, r_Y, and r_Z ≈ 1/3). The fidelity advantage of S is lost when noise is biased towards or away from any of the three directions. We notice in (a) that when noise is biased heavily towards Y, using protocols G results in limited fidelity improvement compared to S and J. Similarly in (b), when noise is biased heavily towards Y or X, using J results in a small fidelity enhancement. Comparatively, our set of protocols S results in some amount of fidelity increase under any noise-bias direction. This indicates that protocols S can be advantageous when only the fidelities of the input faulty pairs are known but one has little information on the shape of the noise.
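To make the bias parameter concrete, the sketch below builds the Bell-diagonal coefficients of a pair with fidelity f and bias degree r. The parameterization is an assumption chosen only to match the three landmark values stated in the text (all noise on the chosen axis at r = 1, none at r = 0, a Werner state at r = 1/3): the biased flip receives weight (1 − f) r and the other two flips share the rest equally. It should not be read as the exact form of Eq. (45). The helper names and the mapping of X, Y, Z flips to vector positions are likewise illustrative assumptions.

```python
def biased_pair(f, r, axis="X"):
    """Bell-diagonal coefficients (A, B, C, D) for fidelity f and bias degree r.

    Assumed convention: A is the target Bell state, B the Y-flipped state,
    C the X-flipped state, and D the Z-flipped state.
    """
    major = (1.0 - f) * r                # weight of the biased flip
    minor = (1.0 - f) * (1.0 - r) / 2.0  # weight of each remaining flip
    w = {"X": minor, "Y": minor, "Z": minor}
    w[axis] = major
    return (f, w["Y"], w["X"], w["Z"])


def dejmps(s1, s2):
    """One DEJMPS step on two Bell-diagonal pairs (standard recurrence rule)."""
    a1, b1, c1, d1 = s1
    a2, b2, c2, d2 = s2
    n = (a1 + b1) * (a2 + b2) + (c1 + d1) * (c2 + d2)
    out = (
        (a1 * a2 + b1 * b2) / n,
        (c1 * d2 + d1 * c2) / n,
        (c1 * c2 + d1 * d2) / n,
        (a1 * b2 + b1 * a2) / n,
    )
    return out, n


# One DEJMPS step on two identical pairs, for several bias degrees and axes.
f = 0.6332
for axis in ("X", "Y", "Z"):
    for r in (0.0, 1.0 / 3.0, 1.0):
        pair = biased_pair(f, r, axis)
        out, p = dejmps(pair, pair)
        print(axis, round(r, 3), "-> fidelity", round(out[0], 4), "p", round(p, 4))
```

A single DEJMPS step responds very differently to the three bias directions, which is consistent with the observation that the definite-order protocols perform unevenly across the panels of Fig. 6.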

B. Advantageous region of parameters
We have seen that the fidelity advantage of S exists when the input faulty pairs are close to Werner states. In this section, we restrict them to be Werner states and examine the region of the parameter space where the fidelity advantage holds [i.e., max(F_G − F_S, F_J − F_S) < 0]. In general, the region is a four-dimensional subspace of the parameter space. To visualize the region, in Fig. 7 we fix F_3 to be equal to (a) 0.5390 and (b) 0.5690 and show the three-dimensional subspace of [F_0, F_1, F_2] bounded by the closed surface. One can see from Fig. 7 that the three-dimensional advantageous region is larger when F_3 = 0.5390 than when F_3 = 0.5690. We have also found (not shown in Fig. 7) that the advantageous region only exists when F_3 > 0.5. When F_3 is close to 0.5, the region is small. The size of the region increases until F_3 ≈ 0.5390, after which it decreases to zero when F_3 ≈ 0.75.

FIG. 6. Distillation fidelities F_S (solid blue curve), F_G (dotted red curve), and F_J (dashed green curve) for the three groups of protocols S, G, and J when the four input faulty states have respective fidelities F = [0.5390, 0.6332, 0.6332, 0.5888] with the same noise bias towards: (a) Y (bit and phase) flip, (b) X (bit) flip, and (c) Z (phase) flip.
In Fig. 7, the advantageous regions (given by the three-dimensional surfaces) show a discrete three-fold rotational symmetry around the axis F_0 = F_1 = F_2. This is expected, as our search algorithm iterates through all protocols within each class of protocols, which include all permutations of the given faulty pairs. As a result, any permutation of given fidelities that are in the advantageous region is also in the advantageous region. One can see from the figures that the regions only exist at some distance away from the axis of symmetry, and that there seem to be two "parts" of the surface: one part lies at lower fidelities and surrounds the symmetry axis, while the other part is longer and extends into higher fidelities.
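The permutation symmetry follows directly from taking a maximum over all arrangements of the pairs. The sketch below illustrates this on a small subset of the definite-order protocols only, namely the three ways of pairing four Werner states in a two-level recurrence. It assumes the standard DEJMPS update rule for Bell-diagonal states, and the helper names are hypothetical. The best fidelity is unchanged under any relabeling of the inputs.

```python
from itertools import permutations


def werner(f):
    """Bell-diagonal coefficients of a Werner state with fidelity f."""
    e = (1.0 - f) / 3.0
    return (f, e, e, e)


def dejmps(s1, s2):
    """One DEJMPS step on two Bell-diagonal pairs (standard recurrence rule)."""
    a1, b1, c1, d1 = s1
    a2, b2, c2, d2 = s2
    n = (a1 + b1) * (a2 + b2) + (c1 + d1) * (c2 + d2)
    out = (
        (a1 * a2 + b1 * b2) / n,
        (c1 * d2 + d1 * c2) / n,
        (c1 * c2 + d1 * d2) / n,
        (a1 * b2 + b1 * a2) / n,
    )
    return out, n


def best_recurrence(fids):
    """Maximum output fidelity over the three pairings of four Werner states."""
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    best = 0.0
    for (i, j), (k, l) in pairings:
        left, _ = dejmps(werner(fids[i]), werner(fids[j]))
        right, _ = dejmps(werner(fids[k]), werner(fids[l]))
        out, _ = dejmps(left, right)
        best = max(best, out[0])
    return best


F = [0.5390, 0.6332, 0.6332, 0.5888]
values = {round(best_recurrence(list(p)), 12) for p in permutations(F)}
print("distinct best fidelities over all relabelings:", values)
```

Because the optimum is taken over every arrangement, the printed set should contain a single value, mirroring the symmetry of the surfaces in Fig. 7.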
We now present the concrete entanglement distillation protocol within each protocol set that leads to the advantageous region. We choose F_3 = 0.5390 and F_2 = 0.5888. Figures 8(a), 8(b), and 8(c) show the protocols within G, S, and J that lead to the maximum distillation fidelities F_G, F_S, and F_J for various F_0 and F_1. The black contours in each plot are the advantageous regions of S, which are essentially a "horizontal slice" of Fig. 7(a) at F_2 = 0.5888. We suggest the reader consult the figure's caption for more information. Here, we point out several features of the graph. First, in Fig. 8(b), the entirety of each black contour lies within a single region, which represents a specific permutation of faulty pairs. As an example, the previously presented set of fidelities, relabeled here as F = [0.6332, 0.6332, 0.5888, 0.5390], has (F_0, F_1) = (0.6332, 0.6332) and belongs to the contour in the top-right corner. It has a permutation "(0,1,2,3)", which uses entangled pair 3 (with a fidelity F_3 = 0.5390) to C-SWAP the zeroth and first entangled pairs (with fidelities 0.6332). Interestingly, we find in Fig. 8(b) that it is always the entangled pair with the lowest fidelity being used to C-SWAP the two highest-fidelity pairs that leads to the highest distilled fidelity in S. Second, one can also see from Figs

FIG. 8. Concrete distillation protocols within each set of protocols (a) G, (b) S, and (c) J that lead to maximum fidelities for four Werner states with various F_0 and F_1 when F_3 = 0.5390 and F_2 = 0.5888. Within the black contours are where protocol S has a fidelity advantage over G and J. The notations in the legends that denote the protocols have the following meanings. (a) A single number i in a pair of parentheses means the ith entangled pair (with fidelity F_i) is simply taken as the output and all other pairs are discarded. Two numbers i and j in a pair of parentheses mean the ith and jth entangled pairs are DEJMPS-distilled. If there is an outer pair of parentheses present, the distillation product of the inner pair of parentheses is used as the input to the DEJMPS step described by the outer pair. (b) "(i, j, k, l)" denotes the C-SWAP protocol where the lth entangled pair C-SWAPs the ith and jth entangled pairs. (c) Two numbers in a pair of parentheses mean those two entangled pairs are DEJMPS-distilled. Three numbers in a pair of parentheses mean those pairs are distilled using the three-pair distillation circuit given in Fig. 2, with the order of appearance before the circuit the same as the order in which the corresponding numbers appear in the parentheses. If an outer pair of parentheses encompasses an inner pair, the distillation product of the inner parentheses is used as the input to the distillation step described by the outer parentheses. A letter with a prime represents a distillation protocol denoted with "0" and "1" swapped compared with the letter without a prime. For example, F′ in (a) represents a protocol denoted by ((1,3),(0,2)), whereas F denotes the protocol ((0,3),(1,2)).
(with fidelity 0.6332) and second faulty pairs (with fidelity 0.5888) and discards the third pair (with the lowest fidelity 0.5390). In contrast, our protocols S do not discard the lowest-fidelity pair but use it to further enhance the fidelity of distillation. This suggests our protocols may use the entangled pairs more efficiently, which is beneficial when the rate of entanglement distribution is low.

VII. DISCUSSIONS
We have presented and studied the practical benefit of applying indefinite causal order in the task of entanglement distillation, which is an important and necessary protocol in practical quantum communication. When four faulty entangled pairs subject to Pauli noise are shared, we have constructed a protocol where one faulty pair is used to control-SWAP two other faulty pairs before two steps of the basic DEJMPS entanglement distillation protocol are applied onto a fourth faulty pair and the two SWAP-ed pairs. We have shown that the constructed protocol can be seen as applying two DEJMPS steps in a superposition of two causal orders. This is done by showing that the overall trace-decreasing map of the protocol can be expressed in the same form as a quantum switch, which indicates the presence of indefinite orders of the two trace-decreasing maps of the constituent DEJMPS steps. It is also shown that for some input faulty states, the protocol returns an output entangled state with higher fidelity and success probability than a wide range of protocols constructed from concatenation of smaller entanglement distillation protocols that follow a definite causal order. This includes concatenation of multiple DEJMPS steps, and concatenation of a DEJMPS step with a typical three-pair distillation protocol constructed from a three-bit stabilizer quantum error-detecting code. The circuit of our protocol has relatively low complexity, making it viable to implement and demonstrate experimentally.
We believe effort should be made into understanding whether, at least for the examples we presented in this paper, the advantage in fidelity and probability of success is really due to indefinite causal order per se, or whether it can be replicated or exceeded with definite-causal-order protocols. This is to complement the recent debate on indefinite-causal-order advantage in other application settings. For example, the effect of noise reduction from putting two noisy Pauli channels in an indefinite causal order as presented in Ref. [11] was later found to be matched in Ref. [37] with a setup (Fig. 6 of that paper) which consists of the two noisy channels arranged in a definite causal order, but with an extra noiseless side channel generated from the control state via an entangling gate between the control and target states. The authors of Ref. [37] argue that such a setup shows that the noise advantage of Ref. [11] is not due to indefinite causal order per se, because it can be realized with other kinds of causally ordered resources. Here in our practical setup, the meaning of "resource" and what counts as a "free resource" is different and more specific. In the task of entanglement distillation, local unitary gates and classical communication are seen as free resources, and faulty entangled pairs are not free (as they are hard to generate). The quantum switch in our circuit should be seen as a free resource, as it is realized purely with local unitary gates. In comparison, other "free resources" that feature "definite causal order" in our setting are other types of entanglement distillation protocols outside of the sets G and J. These protocols can be built from scratch using local unitary gates. Another piece of work that questioned the sole advantage of the quantum switch in noise reduction is Ref. [38]. The authors proposed a setup which simply puts each of the two channels on two separate paths and lets the photon propagate through a superposition of the two paths. For the task of entanglement distillation,
such a spirit can be realized as the following protocol: using a faulty Bell pair (which corresponds to the control qubit in Ref. [38]) to C-SWAP two other faulty pairs before a distillation step is carried out between a fourth faulty pair and one of the SWAP-ed pairs, followed by a measurement on the control pair. This simulates a process in which the fourth pair is distilled with a superposition of either one of the two swapped pairs, followed by a postselection via measuring the control pair. Compared with the circuit of protocols S in Fig. 5, one can see that this is simply a subroutine of that circuit which does not have the second DEJMPS step at the end. This indicates it is unlikely to surpass the performance of our protocols S.
On the other hand, there are numerous possible extensions of our work. A natural and immediate one is to compare the fidelity and success probability of applying more than two DEJMPS steps in an indefinite causal order against definite-causal-order protocols that distill the same number of faulty pairs. This will require multiple Bell states, or a smaller number of higher-dimensional entangled states, to act as the control state of the quantum switch. Alternatively, one may also consider still using one faulty Bell state as the control state by keeping the number of constituent CP maps in the superposition at two, where each constituent CP map will be an entanglement distillation protocol that uses more faulty pairs. Extensions to superposing the causal orders of multipartite entanglement distillation protocols and one-way entanglement distillation protocols can also be carried out. Given the known connection [22,31] of one-way entanglement distillation protocols with stabilizer quantum error-correcting codes, we hope the possible existence of an advantage can stimulate effort into incorporating indefinite causal structures into the encoding and decoding of quantum error-correcting codes, which is a vital part of practical quantum information processing. Additionally, modifications similar to our proposal can be envisioned for many other distillation-like and breeding-like protocols that feature repeated applications of some subroutine, each featuring a subsystem interacting with ancillary states. Some examples include distillation of magic states for universal quantum computation [39,40], and repeated breeding of oscillator states [41] for distillation of bosonic quantum error-correcting codewords.

APPENDIX: TWO QUANTUM TELEPORTATION STEPS APPLIED UNDER INDEFINITE CAUSAL ORDER
One noticeable proposal for application of indefinite causal order in quantum communication is putting two teleportation steps in a superposition of causal orders. The standard quantum teleportation protocol [42] has a |Φ+⟩ state shared between Alice and Bob. For Alice to teleport her qubit |ψ⟩, she performs a Bell measurement between |ψ⟩ and her qubit of |Φ+⟩. The measurement result is sent to Bob via a classical channel. Bob maps the measurement result onto a Pauli operator, which he performs on the teleported state to recover |ψ⟩. It is known from [43] that when the shared entangled state is noisy, the above protocol is essentially |ψ⟩ undergoing a generalized depolarising channel. Two noisy teleportation steps applied in series naturally degrade |ψ⟩ more than one step. It was claimed, however, in Refs. [18][19][20] that applying two teleportation steps in an indefinite causal order reduces the noise of the final target state as compared to the two steps applied back-to-back. In Ref. [19], a photonic implementation of this scheme is proposed where entangled photon pairs pass through beam splitters and subsequent Bell measurements such that a distinct causal order is featured on each output path of the beam splitters. On one of the two paths, SWAP gates are performed on the photon pairs to simulate the swapping of teleportation channels. The author in Ref. [19] considered the case when the entangled pair is pure but not maximally entangled. In this section, we examine their proposals [18][19][20] more carefully and show that no noise reduction of the target state actually occurs.
We first consider two concatenated standard teleportation steps with the input target state |ψ⟩⟨ψ| in Hilbert space H_0, whose circuit is shown in Fig. 9. A pure but not maximally entangled state χ is in Hilbert space H_1 ⊗ H_2:

$$\chi = \sum_{mn} \sigma^{(2)}_m |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_n \, q_{mn}, \tag{A1}$$

where σ_m and σ_n are Pauli operators, and the superscript "2" means the Pauli noise occurs on the qubit of χ in Hilbert space H_2 as denoted in Fig. 9. q_{mn} denotes the entries of the density matrix in the Bell basis. A second pure yet not maximally entangled pair ξ lies in Hilbert space H_3 ⊗ H_4:

$$\xi = \sum_{m'n'} \sigma^{(4)}_{m'} |\Phi^+\rangle\langle\Phi^+| \sigma^{(4)}_{n'} \, s_{m'n'}. \tag{A2}$$

The superscript "4" again denotes the Hilbert space on which we assume Pauli noise occurs. The first teleportation step carried out with χ and the original target state |ψ⟩⟨ψ| yields

$$\rho' = \sum_{nmi} \sigma^{(2)}_i G^{(01)}_i \left( |\psi\rangle\langle\psi| \otimes \sigma^{(2)}_n |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_m \, q_{mn} \right) G^{(01)}_i \sigma^{(2)}_i, \tag{A3}$$

where G_i^{(01)} is the Bell measurement projector associated with outcome i and σ_i^{(2)} is the corresponding Pauli correction applied on the output Hilbert space. We now want to express the errors on ρ′ in terms of the errors σ_m, σ_n on χ. The Pauli errors σ_n and σ_m commute with G_i^{(01)} as they act on different Hilbert spaces. They commute with σ_i if m(n) = i, or anticommute with σ_i otherwise. This means

$$\rho' = \sum_{nmi} \sigma^{(2)}_n \left( \sigma^{(2)}_i G^{(01)}_i |\psi\rangle\langle\psi| \otimes |\Phi^+\rangle\langle\Phi^+| G^{(01)}_i \sigma^{(2)}_i \right) \sigma^{(2)}_m A_{ni} A_{im} \, q_{mn} = \sum_{nmi} \sigma^{(2)}_n |\psi\rangle\langle\psi| \sigma^{(2)}_m A_{ni} A_{im} \, q_{mn}, \tag{A4}$$

where A is a global phase caused by permuting σ_{m,n} with σ_i.
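The second line follows because the part in the large parentheses is simply |ψ⟩⟨ψ|, which comes from a perfect teleportation with noiseless |Φ+⟩. We note that Σ_i A_{ni} A_{im} = δ_{nm}. This eliminates all summing components where n ≠ m and we arrive at

$$\rho' = \sum_m \sigma_m |\psi\rangle\langle\psi| \sigma_m \, q_{mm}, \tag{A5}$$

which, as expected from [43], is the original state |ψ⟩ having undergone a depolarising channel. ρ′ now goes through the second teleportation channel, which has an output ρ″ expressed as

$$\rho'' = \sum_{mm'n'k} \sigma^{(4)}_k G^{(23)}_k \left( \sigma^{(2)}_m |\psi\rangle\langle\psi| \sigma^{(2)}_m \, q_{mm} \otimes \sigma^{(4)}_{m'} |\Phi^+\rangle\langle\Phi^+| \sigma^{(4)}_{n'} \, s_{m'n'} \right) G^{(23)}_k \sigma^{(4)}_k, \tag{A6}$$

where the superscripts denote the Hilbert spaces of the corresponding operations. As before, we move σ_{m′}^{(4)} and σ_{n′}^{(4)} across the Pauli corrections, during which global phases A_{m′k} and A_{n′k} arise:

$$\rho'' = \sum_{mm'n'k} \sigma^{(4)}_{m'} \left( \sigma^{(4)}_k G^{(23)}_k \, \sigma^{(2)}_m |\psi\rangle\langle\psi| \sigma^{(2)}_m \, q_{mm} \otimes |\Phi^+\rangle\langle\Phi^+| \, s_{m'n'} \, G^{(23)}_k \sigma^{(4)}_k \right) \sigma^{(4)}_{n'} A_{m'k} A_{n'k}. \tag{A7}$$

We notice that ρ″ can now be interpreted as the inner state σ_m^{(2)} |ψ⟩⟨ψ| σ_m^{(2)} passing through a perfect teleportation channel followed by σ_{m′}^{(4)} on the ket side (or σ_{n′}^{(4)} on the bra side). Since a perfect teleportation preserves the target state, a Pauli error σ_m^{(2)} commutes with a perfect teleportation: having an error on the input target state has the same effect as having the error on the perfectly teleported state. We can hence move σ_m^{(2)} across σ_k^{(4)} and G_k^{(23)}, giving

$$\begin{aligned} \rho'' &= \sum_{mm'n'k} \sigma^{(4)}_{m'} \sigma^{(4)}_m \left( \sigma^{(4)}_k G^{(23)}_k |\psi\rangle\langle\psi| \otimes |\Phi^+\rangle\langle\Phi^+| G^{(23)}_k \sigma^{(4)}_k \right) \sigma^{(4)}_m \sigma^{(4)}_{n'} A_{m'k} A_{n'k} \, q_{mm} s_{m'n'} \\ &= \sum_{mm'n'k} \sigma^{(4)}_{m'} \sigma^{(4)}_m |\psi\rangle\langle\psi| \sigma^{(4)}_m \sigma^{(4)}_{n'} A_{m'k} A_{n'k} \, q_{mm} s_{m'n'} \\ &= \sum_{mm'n'} \sigma^{(4)}_{m'} \sigma^{(4)}_m |\psi\rangle\langle\psi| \sigma^{(4)}_m \sigma^{(4)}_{n'} \delta_{m'n'} \, q_{mm} s_{m'n'} \\ &= \sum_{mn} \sigma_n \sigma_m |\psi\rangle\langle\psi| \sigma_m \sigma_n \, q_{mm} s_{nn}. \end{aligned} \tag{A8}$$

We now calculate the state produced from the scheme proposed in [19], which uses a control qubit |c⟩ = |+⟩ = (|0⟩ + |1⟩)/√2 to swap χ and ξ, then does the two teleportation steps before measuring |c⟩ in the Fourier basis and postselecting the "+" outcome. Before measuring |c⟩, the overall state R, which consists of |c⟩ and the teleportation target state |ψ⟩, reads

$$\begin{aligned} R ={}& \frac{|0\rangle\langle 0|}{2} \otimes \sum_{mn} \sigma_n \sigma_m |\psi\rangle\langle\psi| \sigma_m \sigma_n \, q_{mm} s_{nn} + \frac{|1\rangle\langle 1|}{2} \otimes \sum_{mn} \sigma_m \sigma_n |\psi\rangle\langle\psi| \sigma_n \sigma_m \, q_{mm} s_{nn} \\ &+ \frac{|0\rangle\langle 1|}{2} \otimes \sum_{mnm'n'ik} \sigma^{(4)}_k G^{(23)}_k \left[ \sigma^{(2)}_i G^{(01)}_i \left( |\psi\rangle\langle\psi| \otimes \sigma^{(2)}_m |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_n \, q_{mn} \right) G^{(01)}_i \sigma^{(2)}_i \otimes \sigma^{(4)}_{m'} |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_{n'} \, s_{m'n'} \right] G^{(23)}_k \sigma^{(4)}_k \\ &+ \frac{|1\rangle\langle 0|}{2} \otimes \sum_{mnm'n'ik} \sigma^{(4)}_k G^{(23)}_k \left[ \sigma^{(2)}_i G^{(01)}_i \left( |\psi\rangle\langle\psi| \otimes \sigma^{(2)}_m |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_n \, q_{mn} \right) G^{(01)}_i \sigma^{(2)}_i \otimes \sigma^{(4)}_{m'} |\Phi^+\rangle\langle\Phi^+| \sigma^{(2)}_{n'} \, s_{m'n'} \right] G^{(23)}_k \sigma^{(4)}_k. \end{aligned} \tag{A9}$$

For the last two terms in the summation, we follow a similar strategy by moving the Pauli errors on the entangled pairs past the Pauli corrections, where global phases arise from commutation (or anticommutation). This gives Eq. (A10). The fact that χ and ξ are pure states means that s_{mn} and q_{mn} are factorizable: they can be written as s_{mn} = u_m u_n^* and q_{mn} = v_m v_n^*, where u = {u_m} and v = {v_m} are normalized probability amplitudes of each Bell component such that |u| = |v| = 1. We consider a common situation (which is also the case considered in Ref. [19]) where the two faulty pairs, χ and ξ, are identical. This is motivated by the speculation that they may come from the same single-photon generator. This means u = v e^{iφ} with some constant phase factor φ. We then have

$$q_{mn} s_{nm} = v_m v_n^* u_n u_m^* = v_m (u_n^* e^{i\phi}) u_n (v_m^* e^{-i\phi}) = v_m v_m^* u_n u_n^* = q_{mm} s_{nn}.$$

Substituting this into Eq. (A10) and noticing that σ_n σ_m |ψ⟩⟨ψ| σ_m σ_n = σ_m σ_n |ψ⟩⟨ψ| σ_n σ_m for any n, m, since σ_n and σ_m either commute or anticommute, one can express R as

$$R = |+\rangle\langle +| \otimes \rho'', \tag{A11}$$

which yields an output state ρ″, the same as that from two definite-order teleportation steps, regardless of the basis and outcome of the measurement on the control qubit. No noise reduction has occurred. We leave as future work the examination of the more general case when χ and ξ are mixed states. We expect, however, more practical challenges to implement the scheme in Ref.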
[19] in this case. This is because the mixed Pauli noise is likely to occur during storage or transmission of the entangled states. It is not hard to see that in order for noise interference to occur due to indefinite causal order, the controlled-swapping of the two entangled pairs must happen after, not before, the mixed Pauli noise. This means that if, for example, the major source of noise comes from the physical communication channel during transmission of the entangled pairs, the controlled-swap will have to be carried out between remote parties. Suppose the two faulty entangled pairs are shared between Alice and Bob and between Bob and Charlie, respectively; then extra perfect Bell pairs will have to be pre-shared between Alice and Charlie for the remote swap. The required extra resources bring additional challenges to the practical implementation.
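Two algebraic facts used in the argument above can be checked numerically. The following sketch is a sanity check under illustrative assumptions, not part of the original derivation: the test state |ψ⟩, the amplitudes v, and the phase φ are arbitrary values chosen here. It verifies that σ_n σ_m |ψ⟩⟨ψ| σ_m σ_n = σ_m σ_n |ψ⟩⟨ψ| σ_n σ_m for every pair of Pauli operators, and that q_{mn} s_{nm} = q_{mm} s_{nn} whenever u = v e^{iφ}.

```python
import cmath

# Single-qubit Pauli operators as nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]


def mm(*mats):
    """Product of any number of 2x2 matrices, left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return out


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Arbitrary normalized test state (illustrative choice).
psi = [0.6 + 0j, 0.8j]
rho = [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

# Fact 1: the order of two Pauli errors does not change the resulting state.
order_independent = all(
    close(mm(sn, sm, rho, sm, sn), mm(sm, sn, rho, sn, sm))
    for sm in PAULIS for sn in PAULIS
)

# Fact 2: for identical pure pairs up to a phase, q_mn s_nm = q_mm s_nn.
phi = 0.7                                     # arbitrary constant phase
v = [complex(x) for x in (0.8, 0.4j, -0.4, 0.2)]  # normalized amplitudes
u = [x * cmath.exp(1j * phi) for x in v]


def q(m, n):
    return v[m] * v[n].conjugate()


def s(m, n):
    return u[m] * u[n].conjugate()


identity_holds = all(
    abs(q(m, n) * s(n, m) - q(m, m) * s(n, n)) < 1e-12
    for m in range(4) for n in range(4)
)

print("order of Pauli errors irrelevant:", order_independent)
print("q_mn s_nm equals q_mm s_nn:", identity_holds)
```

Both checks rely only on the Pauli operators commuting or anticommuting and on the two pairs sharing the same amplitudes up to a global phase, which are the two ingredients the text uses to reach Eq. (A11).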

FIG. 2. A three-pair distillation protocol constructed from a three-bit error-detecting code stabilized by Ŝ_1 = Î Ẑ Ẑ and Ŝ_2 = X̂ X̂ X̂, with an extra initial rotation R̂ = exp(−i(π/4)X̂).

FIG. 5. A simple circuit which seeks to simulate two DEJMPS steps in a coherent superposition of two causal orders. This is achieved by coherent swapping of χ_1 and χ_2 at the beginning of the circuit, in contrast to Fig. 4, where χ_1 and χ_2 are kept still and χ_3 is routed around. R̂ = exp(−i(π/2)X̂), where X̂ is the Pauli-X operator. R̂† is the Hermitian conjugate of R̂. Ĥ is the Hadamard gate.
where a ∈ {0, 1} denotes the parity of the Bell component and b 0 ,|β 1,1 , |β 1,0 , |β 0,1 } = {| + , | − , | + , | − }. m (i)a,b s are the corresponding coefficients of the components which satisfy a,b m (i) a,b = 1 due to normalization.The input faulty states ρ in can be expressed as a mixture over components, each being a tensor product of three Bell states: Hermitian conjugate of M 1 ).These terms consist of initial rotations R on two Bell states, CNOTs and projection onto |00 or |11 .It is known from Ref. [21] that R preserves the diagonal structure of a,b : it merely permutes | − and | − , leaving | + and | + unchanged.The trailing CNOT gates hence still act on a Bell-diagonal state.The CNOTs turn |β a 2 ,b 2 |β a 3 ,b 3 into |β a 2 ,b 2 ⊕b 3 |β a 2 ⊕a 3 ,b 3 and preserve the sign of the Bell state (b 3 ) in H 3A ⊗ H 3B , where the subsequent first parity measurement is done.The sign of the Bell state b 3 is important.We note that projections of | + (the case where b 3 = 0) onto |00 and |11 both yield a trivial global phase.But for the "negative sign" | − (where b 3 = 1), projection onto |00 yields a trivial global phase, while projection onto |11 yields a "−1" global phase.Here, since the terms Ôi Ôi † are correlated with the control pair being in |00 00|, |11 11|, and |00 11|, respectively [see Eqs. ( FIG. 7. Regions of [F 0 , F 1 , F 2 ] with (a) F 3 = 0.5390 and (b) 0.5690 where protocols S have higher fidelity than G and J [i.e., max(F G − F S , F J − F S ) < 0].The dashed line is specified by F 0 = F 1 = F 2 .
which uses a control qubit |c = |+ = 1/ √ 2(|0 + |1 ) to swap χ and ξ , then does the two teleportation steps before measuring |c in the Fourier basis and postselecting the "+" outcome.Before measuring |c , the overall state R which consists of |c and teleportation target state |ψ reads R = |0 0| 2 ⊗ mn σ n σ m |ψ ψ|σ m σ n q mm s nn + |1 1| 2 ⊗ mn σ m σ n |ψ ψ|σ n σ m q mm s nn n u n u * m = v m (u * n e iφ )u n (v * m e −iφ ) = v m v * m u n u * n = q mm snn .Substituting this into Eq.(A10) and notice that σ n σ m |ψ ψ|σ m σ n = σ m σ n |ψ ψ|σ n σ m for any n, m since σ n and σ m either commute or anticommute, one can express R as R = |+ +| ⊗ ρ , (A11) Ô11 † .The leftover state in (H 1A ⊗ H 1B ) ⊗ (H 2A ⊗ H 2B ), after tracing out H 3A ⊗ H 3B , is also a tensor product of two Bell states, and is now subject to another round of local rotations, CNOTs and parity measurement.One can use the same argument as above to show that projecting onto |00 and |11 in the parity measurement yield the same state in the leftover Hilbert space H 1A ⊗ H 1B .This means F 00